Cal Lewis • 1 month ago

Hi Lee,

Your plan, as outlined, sounds great. Over the last few years I've taken courses from edX, Udemy, and DataCamp and have completed 95+% of them. I prefer the DataCamp presentation overall: the video segments are short (< 5 min) and are followed by good questions, although the exercises tend to prefill too much of the answer. The basic courses have been very basic; the intermediate and advanced courses are more challenging and interesting. All my study has been in R, Python, and various flavors of SQL; I haven't used Excel in years.

This is the right course for me but I have a few questions: How much do we need to know about constructing a valid survey? Is survey analysis similar to or same as sentiment analysis? Is part of the analysis to detect if a statement or question has built-in bias? Also, are choices on a survey question linear or assumed to be linear? Are survey questions neutral or assumed neutral so that the person answering the question is not nudged in one direction or other?

A lot to be learned here, I'm looking forward to it.

Lee Baker • 1 month ago

Thanks for sharing your thoughts Cal.

Your questions about constructing valid, unbiased, neutral survey questions are very relevant. We will be doing a little work on this in the course, but I won't be going into too much depth. Constructing surveys and survey questions is as much about psychology as anything else, and is not an area I'm too comfortable with.

In these courses I'll be focusing mainly on how to analyse the data you have rather than on how to ask correctly-formed questions to gather the data (which is fiendishly complicated all on its own!).

Also - survey analysis and sentiment analysis are quite different. You use survey analysis to analyse the quantitative data collected in tick-boxes, Likert scales, etc., while sentiment analysis can be used to analyse the emotional content of qualitative data found in free text areas of a survey (or social media posts, medical reports, etc.).

Hope this helps.

Steve Smith • 1 month ago

Hi Lee,

Learning objective: My mission statement is taken from my prior comment "is to understand linear and non-linear analyses of categorical data with surveys that typically include data with multiple response categorical variables (aka MRCV) which are questions with "answer all that apply" responses." In doing so, linear would be the first step followed by non-linear methods and their advantages - the latter may be a separate course or courses.
Content: data sets and descriptive stats - how and why to use them together (find/correct issues). Inferential statistics - analysis of variance, multiple regression, factor analysis, etc. - and a taste of modeling, e.g., how inferential results can facilitate modeling.
Gaps: non-linear statistics, modeling - I understand the mechanics of setting up the data and running an analysis, but not the underlying statistics. It's been a long time since I studied inferential stats, but I am starting to use them more.

Lee Baker • 1 month ago

Great mission statement Steve!
When we start the course I'll be asking you to make a note of this and keep referring back to it to help keep me on track.

Steve Smith • 1 month ago

Thanks Lee and not a problem.

Steve Smith • 1 month ago

Hi Lee,

Plan: Yes, it sounds like a course I would enjoy.
Course completion: I complete ~90% of the courses I sign up for - they are typically part of a professional cert, hence the incentive. The 2 courses I did not complete - that was my intention - I was curious about the material only. Smaller, more entertaining courses and "doing vs listening" fits me perfectly.
Data: Categorical survey data is what I use most often. My end goal is to understand linear and non-linear analyses of categorical data with surveys that typically include single response and multiple response questions combined in a single survey; a common type, for example, is where the single response might be demographics and multiple response might be several questions each with its own set of "answer all that apply" responses.

Thanks for your video and getting this started.
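
To make the data layout described above concrete, here is a minimal sketch of one common way to prepare "answer all that apply" (multiple response) data alongside a single-response demographic question. The column names, the semicolon separator, and the sample rows are all hypothetical, and this is only one possible encoding, not a method proposed in the thread.

```python
from collections import Counter

# Hypothetical survey rows: one single-response demographic question and one
# "answer all that apply" question stored as a semicolon-separated string.
responses = [
    {"age_group": "18-34", "devices": "phone;laptop"},
    {"age_group": "18-34", "devices": "phone"},
    {"age_group": "35-54", "devices": "laptop;tablet"},
    {"age_group": "35-54", "devices": "phone;laptop;tablet"},
    {"age_group": "55+", "devices": ""},  # respondent ticked nothing
]

def split_multi(cell):
    """Turn 'a;b;c' into a set of ticked options; an empty cell gives an empty set."""
    return {option for option in cell.split(";") if option}

def indicator_rows(rows, column):
    """Expand one multiple-response column into 0/1 indicator columns."""
    options = sorted(set().union(*(split_multi(r[column]) for r in rows)))
    expanded = []
    for r in rows:
        ticked = split_multi(r[column])
        expanded.append({opt: int(opt in ticked) for opt in options})
    return options, expanded

def crosstab(rows, single_col, multi_col):
    """Count ticks of each option within each level of the single-response variable."""
    counts = Counter()
    for r in rows:
        for option in split_multi(r[multi_col]):
            counts[(r[single_col], option)] += 1
    return counts

options, indicators = indicator_rows(responses, "devices")
table = crosstab(responses, "age_group", "devices")
```

Because one respondent can tick several options, the counts in `table` can add up to more than the number of respondents. The usual chi-square test of independence assumes each respondent falls into exactly one cell, so it does not carry over directly to this kind of table.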

Lee Baker • 1 month ago

Great - thanks for your responses Steve, you've definitely given me food for thought.

Note to everyone reading this:

Do you also have difficulties with the areas Steve is pointing out, or with different ones?

Chris Barnes • 1 month ago

I have also taught postgrad students to analyse survey data, which appears to be non-intuitive for almost all. I am interested to see what others (you) include, what content and in what order. I agree with your comments on online courses being too dry and too long in general, especially your Lego comments. I have a similar approach to learning/teaching mathematics. So almost any approach I expect to be valuable, and I will attempt to ask constructive questions or make constructive comments as applicable. Congratulations on your endeavour!

Lee Baker • 1 month ago

Thanks for the encouragement, Chris, I really appreciate it.

One of the biggest issues with learning how to analyse survey data is "my boss tells me to use the Gami-Wiki-Super-Duper-Whizzbang Test, but I don't know what that is". First, you need to learn to understand the data. Learn the simple stats tests that get you 80% of the way. Then once you've got the fundamentals you'll be in a much better position to understand the harder 20% of the stuff.

Too many people skip the easiest 80% and jump right to the hardest 20%. That's why they think stats is hard!

In my courses I'm going to be focusing on getting the fundamentals right, so you're ready for the hardest 20% when it crops up.

Christopher S. Sneed • 1 month ago

I think you should cover the Likert scale ordinal data first. Some research I came across as we worked on a survey about 12 months ago:


An empirical study on the transformation of likert-scale data to numerical scores by Chien-Ho Wu, Applied Mathematical Sciences, Vol. 1, 2007

Analyzing and Interpreting Data From Likert-Type Scales by Gail M. Sullivan, MD, MPH, and Anthony R. Artino Jr, PhD:
"...parametric tests can be used to analyze Likert scale responses. However, to describe the data, means are often of limited value unless the data follow a classic normal distribution and a frequency distribution of responses will likely be more helpful."

Indeed, we used frequency histograms to display many of the results and they told the story of the survey very well. There were actionable takeaways using this approach.
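
As a rough illustration of the frequency-distribution approach, the sketch below tabulates one Likert item and prints a plain-text histogram. The responses are made up for the example and the function names are hypothetical; it uses only the Python standard library.

```python
from collections import Counter

# Hypothetical responses to one Likert item, in the order they were collected.
SCALE = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]
answers = [
    "Agree", "Strongly agree", "Agree", "Neutral", "Disagree",
    "Agree", "Strongly agree", "Strongly disagree", "Agree", "Neutral",
]

def frequency_table(values, scale):
    """Return (label, count, percent) rows in scale order, including zero counts."""
    if not values:
        raise ValueError("no responses to tabulate")
    counts = Counter(values)
    total = len(values)
    return [(label, counts[label], 100 * counts[label] / total) for label in scale]

def text_histogram(rows, width=20):
    """Render the table as a plain-text bar chart, one bar per category."""
    longest = max(len(label) for label, _, _ in rows)
    lines = []
    for label, count, percent in rows:
        bar = "#" * round(width * percent / 100)
        lines.append(f"{label:<{longest}}  {bar} {count} ({percent:.0f}%)")
    return "\n".join(lines)

table = frequency_table(answers, SCALE)
print(text_histogram(table))
```

Keeping the rows in scale order, rather than sorted by count, preserves the ordinal structure of the item, so the shape of the distribution stays readable.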

Lee Baker • 1 month ago

Thanks for your response Chris - I really appreciate it.

We will be dealing with Likert scales in the course, but just as a quick hint - Likert scales should be analysed as Ordinal data, but can be analysed as continuous as long as the underlying distribution is Gaussian.

Hope that helps a little...
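
The following minimal sketch, on made-up data, shows why the choice between ordinal and continuous treatment matters. Two hypothetical groups have the same mean on a 1 to 5 item, yet their medians and a rank-based comparison differ. The Mann-Whitney U statistic is used here only as one common rank-based option and is an assumption of this example, not something named in the comment above; the sketch computes the statistic only, not a p-value.

```python
from statistics import mean, median

# Hypothetical Likert item coded 1 (strongly disagree) to 5 (strongly agree)
# for two groups of respondents.
group_a = [1, 1, 1, 5, 5, 5, 5, 5]   # polarised answers
group_b = [3, 3, 4, 4, 4, 4, 3, 3]   # answers clustered mid-scale

def mann_whitney_u(x, y):
    """U statistic for x versus y: pairs where x ranks higher, ties count one half."""
    u = 0.0
    for xi in x:
        for yi in y:
            if xi > yi:
                u += 1.0
            elif xi == yi:
                u += 0.5
    return u

summary = {
    "mean_a": mean(group_a), "mean_b": mean(group_b),
    "median_a": median(group_a), "median_b": median(group_b),
    "u_a_vs_b": mann_whitney_u(group_a, group_b),
}
```

With these numbers the means are identical, while the medians and the U statistic (compared with the value of half the number of pairs expected when neither group tends to rank higher) point to different response patterns. Treating the codes as continuous would hide that difference in this case.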

M Lau • 1 month ago

Hello Lee,

I wish to learn how to analyse survey data, especially those with LIKERT ITEMs, which are ordinal data. I hope you can teach this.

As for which software to use, it's great to start with Excel because it's easy to understand. Thank you.

Lee Baker • 1 month ago

Yes, Ordinal data will be dealt with, whether they come from a Likert scale or not - Ordinal data are very important and will be in the Categorical data course.

Thanks for stopping by, and I look forward to hearing more thoughts from you.

M Lau • 1 month ago

You might not have noticed I changed my comments 8 days ago --- Likert Scale has been changed to Likert Items. I thought Likert scale is irrelevant to me since it refers to that kind of survey used in psychology. I am only interested in Likert Items, which is a range of response options like these:
Strongly disagree, disagree, neutral, agree, strongly agree

Lee Baker • 1 month ago

No worries - we'll be covering these data in the course

Jaap Karman • 1 month ago

Thanks for the video. I stopped to give an answer on whether it would be the right course.
Answer (1): it depends; I haven't yet had questions in my current project that are in line with these. Right now I'm working on structuring and modelling data, and removing waste (lean) in the process of delivering to analysts.
Answer (2): completed several that lasted several weeks and many hours. The goal aligned with the tasks I was doing. Completed several that were mandatory at the job (above 80%).
Answer (3): categorical (string) or numerical (floating); that is a question to get right in the data preparation.

Lee Baker • 1 month ago

Thanks for your thoughts Jaap, it's great to have you here!

The next video in the series will be here very soon, and I look forward to hearing your thoughts about that one too...

Tyrone Patterson • 1 month ago

Thanks for the video Lee. CAT data for me.
For the courses I have registered for I have a ZERO completion rate. Often I have a good fundamental understanding of the basics and look to expand on that knowledge by jumping in mid way into a course. With regard to statistics courses, many continue to 'dish up' the usual on descriptive stats and normal distribution reliant methods. Methods that progress past this, describing the usefulness and common application would be of benefit. There is a significant gap between basic stats and machine learning which seems to be how it is online.

Lee Baker • 1 month ago

I know what you mean about stats courses - I have found so many that just regurgitate academic texts without explanation of why you should be doing a certain analysis, or how to interpret the results correctly.

My courses will be hands-on so you get experience as you go - so you will learn to make consistently better data decisions!

Thanks for your thoughts Tyrone - great to have you here!

Edward Waiyaki • 1 month ago

Hi Lee, my key areas of concern are:

1. How to formulate a data analysis plan (usually prior to data collection..?)
2. How to establish the type of data a research objective requires
3. The sequential steps (procedure) followed when conducting a survey data analysis

Lee Baker • 1 month ago

Your first 2 points are covered in my course Statistics - The Big Picture, while the steps to follow during a survey data analysis will be covered in the new course that I'm proposing here.
Hope you'll stick around and follow along...

Florens • 1 month ago

Analysing survey data in Excel & R is probably not the right course for me to be honest. I have been using R rather than Excel for a while now, and am starting to like Python for some computationally intensive applications. Also learning C++ just to be able to.

I'm sure that I'm an anomaly but my completion rate is near 100% for online courses... Maybe I picked the ones I was really interested in, or maybe I'm just weird. I'm not counting the courses I've purchased but haven't started yet. I got quite a few on a large discount - just like you said - on Udemy a while back and have finished those I've started. Took me a while though.

Your 6-step plan sounds good. It's hard to keep people engaged and involved if you're not face to face, especially when they can't ask you anything directly when they get stuck. Also learning by doing is a good idea in general, especially in areas where you apply knowledge and technology. A lot of courses I've done take quite some time explaining things and then only apply them partially; explaining an algorithm and then just using a package or similar to use the algorithm - not code it yourself - is not very instructive.

The choice between categorical and numerical data is not for me I guess, as the course is not for me.
I forgot my password so I'm posting this as a guest.

Florens de Wit

Lee Baker • 1 month ago

Well done for an excellent course completion rate - keep it up!

I appreciate the feedback on the plan of action for the course - I aim to get everybody up to the 100% completion rate that you've set as the benchmark.

Edward Waiyaki • 1 month ago

Using a 'guesstimate', I'd say I've completed only 50% of the online courses I've taken. Yours sounds like an excellent plan and I'm really eager to grapple with the data analysis 'monster' that has terrified me for so long!

Lee Baker • 1 month ago

Compared with a typical completion rate of 3%, at 50% you're doing really well!
I'm aiming to flip the death spiral upside down and turn 3% into 97% - let's see how I do...

Edward Waiyaki • 1 month ago

Greetings Lee, thanks very much for offering to provide a Data analysis course. I am very confident that learning how to analyse data will greatly enhance my professional productivity - especially with respect to producing scientific publications. I tried an online course in R a while back but didn't manage to complete it. Its inclusion in this course would be an added bonus for me. Starting off with numeric data will be fine. I very much look forward to getting started!

Lee Baker • 1 month ago

Thank you for your comments Edward - I'm looking forward to getting started too!

Kalu Ikenga • 1 month ago

Thank you for this piece. Learning by doing. Looking forward to your next video.

Lee Baker • 1 month ago

Thank you Kalu - I'm glad you're happy with the new plan

Kalu Ikenga • 1 month ago

I like the plan and will want to start with numerical data.

Lee Baker • 1 month ago

Numerical data first - gotcha!

Kalu Ikenga • 1 month ago

Yes, it is the right choice for me and the best way to take off. Thank you.

Lee Baker • 1 month ago

Great, Kalu - I look forward to starting the building process!