Tag Archives: analytics

A very interesting piece of student work from my business statistics class

This is a very interesting presentation submitted as the final exam for my “Business statistics” class at Yokohama National University in Japan.


My classes are all for international students, and this is the work of a student from Vietnam.

The students learned how to set a practical goal and how to effectively use some analytical techniques to support the conclusion(s).

The student tried to find the differences between students from two countries, Vietnam and Japan, in terms of GPA and their objectives for studying at university.


While the sample size is small and the conclusions are not necessarily surprising, the approach and the analysis themselves were quite interesting.


My class aims not just to teach academic analytical techniques but to teach how to apply those techniques to meet practical goals.

I would be happy to give my lecture anywhere in the world.


[Images: Hana 1, Hana 2, Hana 3, Hana 4, Hana 5]

What you need to consider when drilling down the data (vol.1)

People drill down into data along some axis.

For example, sales amount data can be broken down (drilled down) by area, by branch, by product, etc.
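Here is a minimal sketch of drilling down the same sales data along two different axes. The records, the field names, and the `drill_down` helper are all hypothetical and made up only for illustration.

```python
from collections import defaultdict

# Hypothetical sales records (made up for illustration only).
sales = [
    {"area": "East", "branch": "E1", "product": "A", "amount": 120},
    {"area": "East", "branch": "E2", "product": "B", "amount": 80},
    {"area": "West", "branch": "W1", "product": "A", "amount": 150},
    {"area": "West", "branch": "W1", "product": "B", "amount": 50},
]

def drill_down(records, axis):
    """Sum the sales amount for each value of the chosen axis."""
    totals = defaultdict(int)
    for record in records:
        totals[record[axis]] += record["amount"]
    return dict(totals)

by_area = drill_down(sales, "area")
by_product = drill_down(sales, "product")
print(by_area)     # {'East': 200, 'West': 200}
print(by_product)  # {'A': 270, 'B': 130}
```

In this made-up data the totals by area are identical while the totals by product differ, so the choice of axis changes what you are able to see.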

It is technically possible to drill down by any axis, but you may find it not practical at all when you try to apply the results to your business. Why?

From my experience and survey, I found three major points you should consider when you select the axis for drilling down the data.

One of the three is “Impact on the goal”.

You need to make sure that the result of using the axis will ultimately contribute to or affect the goal.

You can break the data down by customer location (area), for instance. But if the business is an internet shop, then location does not matter at all. I know this is too simple an example, but people tend to skip this check when they simply pick some axis to break down the data.

The question you should ask would be “Is the axis really a key driver of the goal?”

From my lecture note at university #11 (Modeling with statistics)

We started inferential statistics today, but we will cover it only this week and next week.

Many people hate the statistical testing stuff, as it is very confusing and they sometimes feel it is not practical. Therefore, I focused only on the most practical case, which is testing the gap between two averages using a t-test.

It was really hard to explain the concept of statistical testing (population and samples). At the start, I used an example of coin-tossing to test for a 50%:50% chance.
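The coin-tossing idea can be sketched as an exact binomial test. The result of 16 heads in 20 tosses is a made-up number, and the function name `two_sided_binomial_p` is my own; this is one simple way to do it, not necessarily the way it was shown in class.

```python
from math import comb

def two_sided_binomial_p(heads, tosses, p=0.5):
    """Exact two-sided p-value: the total probability of all outcomes that are
    at least as unlikely as the observed one, if the coin really has chance p."""
    def prob(k):
        return comb(tosses, k) * p**k * (1 - p) ** (tosses - k)

    observed = prob(heads)
    # The small tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(tosses + 1)
               if prob(k) <= observed * (1 + 1e-9))

# Hypothetical experiment: 16 heads out of 20 tosses.
p_value = two_sided_binomial_p(heads=16, tosses=20)
print(round(p_value, 4))  # 0.0118
```

A p-value this small (below the commonly used 5%) suggests that such a result would be rare for a fair coin, so you would doubt the 50%:50% assumption.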

Secondly, I reminded the students that the data we had been using was just sample data, not the whole population, which you always have to be aware of.



I always try to explain a practical way to use the statistics, rather than just a theory. The step chart below was used as an introduction.


This was the hardest part to explain. I showed two approaches, the critical value approach (t-value) and the probability approach (p-value).
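To show that the two approaches lead to the same decision, here is a rough sketch using only the Python standard library. It assumes Welch's version of the t-test (unequal variances), which is my choice for the example, and the two groups of scores are made-up numbers. The t distribution is integrated numerically, so the values are approximate.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Approximate CDF: Simpson's rule on [0, |x|] plus symmetry around zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def critical_value(df, alpha=0.05):
    """Two-sided critical t-value, found by bisection on the CDF."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def welch_t_test(a, b):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    p = 2 * (1 - t_cdf(abs(t), df))
    return t, df, p

# Hypothetical scores for two groups (made up for illustration only).
group_a = [3.1, 3.4, 2.9, 3.6, 3.3]
group_b = [2.8, 3.0, 2.7, 3.1, 2.6]

alpha = 0.05
t_stat, df, p_value = welch_t_test(group_a, group_b)
crit = critical_value(df, alpha)

# Critical value approach: is |t| beyond the critical t-value?
reject_by_t = abs(t_stat) > crit
# Probability approach: is the p-value below alpha?
reject_by_p = p_value < alpha
print(reject_by_t, reject_by_p)
```

Both flags always agree, because the critical t-value is simply the t-value whose p-value equals alpha.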


Finally, as always and as is my policy, I had the students solve some practical problems as follows:


In the next session, we will continue to learn more about statistical testing.

From my lecture note at university #9 (Modeling with statistics)

Today’s topic was (simple linear) regression analysis.

As usual, the main focus of my class is not to learn academic theory but to be able to apply the tools to practical business issues.


After the review of correlation analysis, I started talking about what “regression” is.


Using Microsoft Excel, the students solved a sample exercise and a couple of other questions.


Through those exercises, they learned how to use the equation obtained from the regression analysis. It is useful not only for future predictions but also for planning, optimization, etc.

I spent a lot of time helping them understand what the slope of the equation means and how they can apply the concept to problem-solving.
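As a small sketch of how the regression equation can be used, the code below fits a line by least squares and then uses it both for prediction and for planning. The advertising and sales figures are made up (and deliberately fall on a perfect line so the numbers are easy to check), and `fit_line` is a hypothetical helper name. In class we used Excel, not Python.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: advertising spend (x) and sales (y).
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)  # 2.0 5.0

# Prediction: expected sales if we spend 60 on advertising.
predicted_sales = intercept + slope * 60
print(predicted_sales)  # 125.0

# Planning: spend needed to reach a sales target of 150.
required_spend = (150 - intercept) / slope
print(required_spend)  # 72.5
```

The slope of 2.0 means that each additional unit of advertising spend is associated with 2 more units of sales in this made-up data, which is exactly the number you need when you plan backwards from a target.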


From my lecture note at university #5 (Modeling with statistics)

Today’s focus was CV (Coefficient of Variation).

It is used to measure “relative” variability and is indispensable when you compare the variability of data sets with different averages.


The impact of a standard deviation of $2,000 is not the same for a large store with average monthly sales of $500,000 and a small shop with average monthly sales of $5,000.

In such a case, you have to cancel out the difference in the average (the data scale) by dividing the standard deviation by the average, which gives the CV.
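Using the two stores from the example above, the calculation looks like this. The function name `coefficient_of_variation` is just a label I chose for the sketch.

```python
def coefficient_of_variation(std_dev, average):
    """CV = standard deviation divided by the average."""
    return std_dev / average

# Same standard deviation ($2,000), very different averages.
large_store_cv = coefficient_of_variation(2_000, 500_000)
small_shop_cv = coefficient_of_variation(2_000, 5_000)

print(large_store_cv)  # 0.004
print(small_shop_cv)   # 0.4
```

The same $2,000 is only 0.4% of the large store's average sales but 40% of the small shop's, so relative to its size the small shop's sales vary 100 times more.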


A question in the class was which index you would like to invest your money in, and why. (Nikkei 225, NY Dow, and JPY/USD FOREX)

I was expecting the students to look at the recent trend in terms of value and risk across the indices.

You may compare the value trend using the monthly average and the risk using the CV.

In conclusion, only FOREX had an upward trend, and it also had the lowest risk (CV) compared with the other two.


I hope they enjoyed the team discussions in the class.

See you all next week!

From my lecture note at university #3 (Modeling with statistics)

Today (the 3rd session), I started the class with a recap of the last session on the hypothesis approach.

After the team study on developing the logic tree, I asked the question of how to prioritize the hypotheses logically.

I showed a “Payoff matrix” as a way to set priorities.


Today’s main topic was “Average”.

Using the average is easy, but relying heavily on the average alone may be misleading if you do not know the distribution of the original data.
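A quick sketch of why the average alone can mislead: the two data sets below are made up so that they share the same average but have very different distributions.

```python
from statistics import mean, median

# Hypothetical monthly sales of two shops (made up for illustration only).
steady_shop = [48, 49, 50, 51, 52]
one_big_month = [10, 10, 10, 10, 210]

print(mean(steady_shop), median(steady_shop))      # 50 50
print(mean(one_big_month), median(one_big_month))  # 50 10
```

Both shops have an average of 50, but in the second one a single large month hides four poor ones, which you only notice when you look at the original data (or at least at another measure such as the median).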



After the three sessions, I have learned a lot and need to adjust my class to fit the students:

(i) Undergraduate students may not necessarily know the basic stuff (profit structure etc.) that a business person commonly knows.

(ii) They are not familiar with the basic operations of Microsoft applications (Excel).

It is very challenging for me but extremely exciting!!

How could you explain what “Average” is to a child?

I always pose a question in my business statistics class:

“How would you explain “Average” if a kid asked you?”

As of today, I have not encountered even a single person who was able to answer properly.

Some people said “the total divided by the number of data points”.

But that is just how to calculate it, not what an average is.


While average is one of the most basic tools, it is not easy to understand.


Average is the leveled size of the data, regardless of how many data points you have.

You can think of each single data point as a candle.

If those candles melt, the height of the melted chunk of candle is the average.
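The candle picture can be checked with a few lines of arithmetic. The three heights are made up, and the sketch assumes all candles have the same thickness, so that height stands for the amount of wax.

```python
# Hypothetical candle heights (all candles assumed equally thick).
candles = [2, 4, 9]

# Level height after melting: the same total wax shared equally.
level = sum(candles) / len(candles)
print(level)  # 5.0

# Wax above the level (from taller candles) exactly fills the gaps
# below the level (in shorter candles).
surplus = sum(h - level for h in candles if h > level)
shortfall = sum(level - h for h in candles if h < level)
print(surplus, shortfall)  # 4.0 4.0
```

The total amount of wax does not change; it is only leveled, and that leveled height is the average.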


A dictionary says: Average is a statistic describing the location of a distribution.

Even more confusing for anyone...