Monday, 21 July 2014

Class 4 is now available

The six lessons for Class 4 are now available on the course website:

This class introduces some more advanced methods and techniques. The topics are:
  • 4.1: Classification boundaries
  • 4.2: Linear regression
  • 4.3: Classification by regression
  • 4.4: Logistic regression
  • 4.5: Support vector machines
  • 4.6: Ensemble learning
The last three are high-performance contemporary algorithms. I aim to give you a conceptual understanding of what they do and how they work, but not the gory details. You have to learn to live and work in a world where you don't understand everything. You will see some mathematics in Lessons 4.2, 4.3 and 4.5. But don't worry: I'll explain it, and anyway you don't have to fully understand the math. 

Next week is the last. And it's short: Class 5 has only 4 lessons, not the usual 6. And it's more relaxed: no math at all.
cheers, and keep going!

Monday, 14 July 2014

Class 3 is now available

The six lessons for Class 3 are now available on the course website:

After this week there are 2 weeks to go (classes 4 and 5).

The mid-course assessment is also now available. Do it when you have finished Class 2 (although it will remain open for the rest of the course). The final assessment will appear during week 5.
My goal is to enable you to learn as much as possible from this course, and I recognize that doing the assessments may not be a priority for you. However, our ability to mount follow-up MOOCs will depend on the success of this one as perceived by my University -- and the number of people who complete it successfully will be a key metric. Thus I urge you to do the assessments for my sake, if not your own :-)

cheers, and keep going! Weeks 3 and 4 are the central part of this course.

Monday, 7 July 2014

Class 2 is now available

The six lessons for Class 2 are now available on the course website:

Following that, there are 3 weeks to go (classes 3, 4 and 5).
The activities are a crucial part of the course: they're where most people will do their actual learning! However, they do not form part of the assessment, so don't be scared to get wrong answers.

"Data Mining with Weka" has been designed so that participants at many different levels can learn as much as possible – and complete the course successfully. You don't have to do the reading. All you must do to succeed is complete the mid-course and final assessments -- which you can try as often as you like. The mid-course assessment will become available this week (9 July) and remain open for the rest of the course. The final assessment will appear during week 5.

cheers, and keep going!


Monday, 30 June 2014

Welcome to "Data Mining with Weka"

Welcome to the course "Data Mining with Weka". The six lessons for Class 1 are now available on the course website

We will release classes 2, 3, 4, and 5 at approximately the same time (Monday noon NZ time) in the upcoming weeks, and send reminder announcements.
The course includes the following resources:
  •  the Weka software; Lesson 1.2 gives downloading instructions (we are using version 3.6.11)
  •  videos, one per lesson, on YouTube
  •  the videos include captions (English and Chinese), which can be turned on in YouTube
  •  we recommend viewing in HD format, again a YouTube control
  •  slides used in the videos (PDF format)
  •  text files containing transcripts of the videos
  •  activities that follow each lesson
  •  access to selected excerpts from Data Mining (3rd Edition) - plus you can buy a discounted copy from the publisher
  •  mid-course assessment (opens 9 July, in Week 2)
  •  final assessment (opens 28 July, in Week 5)
  •  announcement forum, blog, and Twitter feed (available from the course website)
  •  discussion forum: Teaching Assistants will be available to help you. Primarily this will be in English but some of the Assistants have said they can also help in other languages.

For Chinese participants:
  •  videos on Youku
    • one version with captions in Chinese (another with English captions is available on our Youku channel)

Some notes for participants:
  •  work through the videos and activities at your own pace, in your own time
  •  a new class appears every week; old classes will remain available until the course closes
    •  in theory, you could leave all your learning to the last week (we don't recommend this!)
  • please subscribe to the announcement forum if you haven't already done so: this is the best way to stay up-to-date with the course (click on
  • only the mid-course and final assessments count towards the Statement of Completion
  •  feel free to install Weka in advance, but please ensure that you have version 3.6.11
  •  if you already know something about Weka, feel free to skip the first class (or two)
  •  during the videos, it may help to follow with Weka on your own computer ("click along with Ian")
  •  the course should take 2–3 hours/week (3–4 hours if you do the optional reading).
Please help us by filling out the pre-course survey if you have not already done so.
cheers, and good luck

Friday, 27 June 2014

Volunteer Community Teaching Assistants

We would like to invite volunteers to be Community Teaching Assistants for the next session of Data Mining with Weka. Two types of people are likely to be effective in this role:
  • learners who completed the first session of Data Mining with Weka
  • existing Weka users
For those who took the first MOOC, the Community Teaching Assistant role covers some of the work that Peter performed. It is not so much answering questions as providing responses that encourage learners to solve their own problems.

This is a call for volunteers: we don't have the resources to pay you anything, but we will produce a special version of the Statement of Completion for the Community Teaching Assistants.

To volunteer you should email wekamooc[at] with a summary of your experience (which can simply be that you completed the first run of the MOOC). The session starts on 30th June 2014.

Wednesday, 25 June 2014

Enrolment opens for "Data Mining with Weka"

Enrolments have opened for a new session of Data Mining with Weka:

The course will start on 30 June 2014 and extends over 5 weeks. It features:
You can also follow the course via Twitter or the blog:

Tuesday, 10 June 2014

Closing thoughts

The MOOC ended last week with Class 5. It will remain open until the end of Wednesday, 11 June (all time zones). Statements of Completion will be emailed a few days after the course closes. Please check your name (and marks) in the My Profile section of the course website: we will use that name on the Statement of Completion.

The course material will remain available indefinitely at:

under a Creative Commons Attribution 3.0 Unported (CC-BY 3.0) license. Use it however you like! We've also added the music, as an MP3 file, to the course material.

We are planning to re-run Data Mining with Weka beginning on 30 June. If you would like to be a volunteer Community Teaching Assistant for that course then watch out for an announcement about how to participate.

We will also re-run More Data Mining with Weka sometime, not sure when.

A survey about the course is available at:

Please fill it in! The response rate, and your feedback, as well as the course completion rate, will no doubt influence our ability to mount future courses.

Those who have completed this course are now experts on Weka and data mining. One person said "I felt slightly surprised taking this course at how basic my understanding actually was from the first class. Hopefully, there wouldn't be quite as much of this feeling in the next course." You're right, the introductory course was rather simplistic, but this one has been thorough. You have learned a lot, and you really are an expert now.

Nevertheless, there is more to be learned! Quite a lot of interest has been expressed in a further course, Advanced Data Mining with Weka. This may happen, but it will not happen soon: I am taking an extended 3-month holiday from the University (and from MOOCs); after that it will take many months to prepare a new MOOC. In the meantime, if you are interested in scripting, there is a new Weka wrapper for Python.

You can read more about Waikato's machine learning research group at:

Our publications are listed under the Publications tab.

In the good old days, Weka was an externally funded research project. But that ended long ago. Both Weka and this MOOC are supported entirely by our Department and University. If you think these efforts are worthwhile and would like to support them financially, that would be lovely! Please do so here:

All donations are directed to research: no administrative charges are incurred.

Finally, how about coming to Waikato to study? Our Department's web site at

has links to our research groups, and to graduate student information (MSc and PhD).

Excuse the advertising :-). Hope you had fun with the MOOC. Don't forget the survey. See you again! — perhaps in Advanced Data Mining with Weka.