Category Archives: Data analysis for daily jobs

(Data analysis: Recipe theory) What do you want to achieve? What do you need to learn?

Which knowledge and techniques you should learn really depends on your goal.

Still, many people try to learn more about (academic) statistics and/or new technologies such as Big Data, Data Science, Machine Learning, Python, R or AI, although in most cases (especially for non-data scientists who want to use data analysis in their daily business) these do not contribute to achieving the goal.


Let’s look at a simple story:





Now let’s assume your ultimate goal is “making an awesome dinner”.

Do you think that learning how to make a fire really contributes to the goal?


Learning how to make a fire is just the same as learning academic statistics such as the t-test, the chi-square test, standard deviation, etc.


It used to be useful and important to learn how to calculate a standard deviation and what its theoretical definition is. But you no longer have to know everything, as machines (software, tools, applications) have already taken your place.


The “Gas cooker” represents those tools, which you can rely on for making a fire.

They calculate these things even faster and more accurately than you can by yourself, just as a microwave heats a meal automatically even if you do not know how it works.

You do not have to know the mechanism of the microwave but you can achieve the goal.

What is the reason you still need to keep calculating manually or with a calculator?


More importantly, you still cannot achieve your ultimate goal (dinner) even if you use a gas cooker.

You have to know:

  – What material (data) to use for the meal?

  – How to cut the material (data)?

  – What tools (analytical tools) to use to cook?

  – How to cook (what arrangement is needed etc.)?


Those skills and that knowledge are described in a recipe, and this is exactly what you need to learn to achieve the goal (making dinner), as the rest of the process is handled by technology (the gas cooker), not by you.


This is why I am teaching “HOW TO COOK (Recipe)” to my clients and students, rather than how to operate analytical machines.


Be clear about what you want to achieve and what you need to learn.


I named this concept “Recipe theory”.

#Data analysis







How to proceed with the problem-solving by data analysis

This slide is designed to help my business students (international students at Yokohama National University) understand the process of problem-solving with data analysis in my class.


A simple marketing case is used in the slide, in which you verify the effectiveness of a D/M (Direct Mail) promotion.


In general, your first hypothesis is the choice of axes along which to break down the raw data.

In this case, there are only two axes (Direct mail delivery and gender) for the purpose of simplicity.



By illustrating the gap for each axis, you can identify which axis has the largest gap (in this case, D/M delivery has a larger gap than the gender axis).

You can also see that the gap comes from female customers.


Now you are ready to develop the next hypothesis (Reasoning hypothesis) for the current result.

Questions you can ask yourself to form this hypothesis include:

– Why are only female customers affected by the D/M delivery?

– What information in the D/M affects those customers?


With those questions, you may need to collect data to verify the correlation between those factors and your findings (only female customers are affected by D/M delivery).
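
As a minimal sketch of that verification step, assuming you have collected one candidate factor per customer, the following Python snippet computes a Pearson correlation coefficient by hand. The factor (`coupon_value`, a discount amount printed in the D/M) and all the numbers are hypothetical and only illustrate the calculation; they are not data from the class. In Excel, the CORREL function gives the same coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one entry per female customer who received a D/M.
coupon_value = [0, 5, 10, 15, 20]   # assumed factor, made up for illustration
visits = [2, 3, 3, 5, 6]            # assumed visit frequency, made up

r = pearson(coupon_value, visits)
print(f"correlation = {r:.2f}")
```

A coefficient close to 1 or -1 suggests the factor is worth a closer look; it does not by itself prove that the factor causes the visits.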


This is only a simple case to show students the process of problem-solving.

However, it is essential to know the whole process BEFORE you start, as otherwise you will get lost in the middle of it without knowing where you are and what you need to do next.



Data analysis basic exercise (comparison and graphing)

How to find the insights from data


Business data analysis class started again at Yokohama National University.

The class is designed mainly for 1st and 2nd year international students, to give them business data analysis (problem-solving) skills.


We focused on some basic techniques and the thinking process for turning raw data into insights.

This was one of the exercises we used in the class:

Data exercise

The students were required to make their own “conclusion” (not just findings or analysis results) with Excel.

In this case, the output (goal) parameter is “frequency of visit”, while the input (explanatory) parameters are (i) Gender and (ii) Direct Mail (D/M) received or not.


In order to get a first insight from the data, you may compare the data on each parameter (in this case, gender and D/M received). Let’s look at examples of the outputs:

Data exercise 2

Data exercise 3

In order to simplify the original data, averages are used in the upper matrix.

The following bar charts show the results of the comparison for the two parameters (gender and D/M).


Which parameter is more critical to the goal? Look at the size of the difference!


“D/M received” shows a larger gap between “YES” and “NO” than the gap between the genders.

So you may focus on the D/M effect more than on the gender difference. (Let’s call it the “Key parameter”.)


Then you may dig into the D/M effects to find more detail (by splitting by gender).

Data exercise 4


This is an example of a conclusion drawn from the chart above:


“Direct Mail affects the frequency of visits, but only female customers show this effect. Therefore, my proposal is to keep sending D/M to potential female customers to get more visitors.”


Again, here are the points to learn from the exercise:

  • How to find the key parameter to dig into further?

          => Find the parameter with the “larger gap”, which indicates the more critical factor for the goal/issue.


  •   What to do once you find the key parameter?

          => Split the data by another parameter to dig into the “Key parameter”.
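
The two steps above can be sketched in a few lines of Python (the class itself used Excel). The records below are made up for illustration and are not the class data, and the helper names `averages_by` and `gap` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (gender, D/M received?, visits per month).
# These numbers are invented for illustration only.
records = [
    ("F", "YES", 6), ("F", "YES", 5), ("F", "NO", 2), ("F", "NO", 3),
    ("M", "YES", 3), ("M", "YES", 4), ("M", "NO", 3), ("M", "NO", 4),
]
FIELDS = {"gender": 0, "dm": 1}  # column position of each input parameter


def averages_by(rows, field):
    """Average visit frequency for each value of one parameter."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[FIELDS[field]]].append(row[2])
    return {key: mean(values) for key, values in groups.items()}


def gap(rows, field):
    """Difference between the highest and the lowest group average."""
    avgs = averages_by(rows, field)
    return max(avgs.values()) - min(avgs.values())


# Step 1: the parameter with the larger gap is the key parameter.
gaps = {field: gap(records, field) for field in FIELDS}
key_parameter = max(gaps, key=gaps.get)

# Step 2: split the data by the other parameter and look at the
# key parameter's gap inside each group.
other = next(f for f in FIELDS if f != key_parameter)
drill_down = {}
for value in sorted({row[FIELDS[other]] for row in records}):
    subset = [row for row in records if row[FIELDS[other]] == value]
    drill_down[value] = gap(subset, key_parameter)

print("gap per parameter:", gaps)
print("key parameter:", key_parameter)
print("gap of key parameter within each group:", drill_down)
```

With this invented data the D/M gap is larger than the gender gap, and the split shows that the gap exists only in one group, which is the pattern the exercise asks you to look for.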



Business statistics class for undergraduate students

This is a part of my university program “Practical application of Business statistics”.

We started the semester by asking the students to

“Make your conclusion(s) by comparing any kinds of two or more data sets”.


The students were allowed to use any data, index or technique to reach a conclusion, as I had not yet taught any analytical techniques or thinking process.


After one week, the students had prepared quite interesting work, as follows (this is only a part of the results):


Each student presented their own work within 5 minutes, and we discussed in class how convincing and interesting it was. We also discussed how it could have been improved to make the conclusion clearer and more convincing (this is the important part).


My objective for the assignment was to have the students make a “story” based on the facts (data), rather than simply compare the data, as the goal of business data analysis is NOT to produce an analysis result but to find useful insights and reach a conclusion that convinces your business partners.


I gave each student feedback from the following viewpoints:


(A) Did you make an appropriate comparison? For example, the numbers of automobiles in the US and Japan cannot be directly compared, as the two countries have different populations.


(B) Did you reach a conclusion (story) rather than just show calculation results? Your audience wants to hear not results but a “conclusion”!


(C) Is your conclusion based on the facts derived from the data? Did you rely on many of your own assumptions to reach it? (This is called a “logic jump”.)


These three points are fundamental to business data literacy. I am not going to talk about “academic” statistics in my class but very practical business data literacy which is required in the AI (Artificial Intelligence) era.
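
Point (A) can be sketched as follows: before comparing totals, put them on a common base such as “per 1,000 people”. The two countries and all figures below are hypothetical and are not real statistics; the function name `cars_per_1000` is also an assumption for illustration.

```python
# Hypothetical figures for two made-up countries (not real statistics).
countries = {
    "Country A": {"cars": 200_000_000, "population": 400_000_000},
    "Country B": {"cars": 60_000_000, "population": 100_000_000},
}


def cars_per_1000(entry):
    """Number of cars per 1,000 people, a population-adjusted measure."""
    return entry["cars"] * 1000 / entry["population"]


# Comparing raw totals favours the larger country.
raw_leader = max(countries, key=lambda name: countries[name]["cars"])

# Comparing on a per-capita base can reverse the picture.
per_capita = {name: cars_per_1000(entry) for name, entry in countries.items()}
per_capita_leader = max(per_capita, key=per_capita.get)

print("most cars in total:", raw_leader)
print("cars per 1,000 people:", per_capita)
print("most cars per 1,000 people:", per_capita_leader)
```

In this invented example the country with more cars in total has fewer cars per person, so a conclusion drawn from the raw totals alone would not be convincing.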


I would be happy to deliver my workshop at your university as well!


#statistics  #university #AI #Business application

Part 1: What does it mean “to utilize data analysis in practice”? What do you need?

■ “I cannot master the data” because “XX is not enough”
As a professional in solving practical problems using data analysis, I conduct business training, practical support, lectures, etc. I also teach at business schools and universities.

Many people have already studied data analysis and statistics. Nevertheless, I always hear the following concerns:
“Somehow I am not comfortable with the results”
“We have data, but I do not feel that I have mastered it”
“The results are not what I expected”
“I cannot actually use what I learned”

These concerns are not necessarily caused by misuse of hard skills such as “analytical methods”, “data science” or “statistics”; rather, the soft skills of the people using them are not enough.
■ Then, what is practical “data analysis”?
Can you answer these questions clearly and specifically: “What do you need for data analysis?” and “What do you need to do for data analysis?” What is the output of “data analysis”, and what can you do with that output to achieve your goal?
In fact, many people tend to start playing around with data without making those answers clear. In that case, you may not be able to draw any useful information out of the data.

We often hear the term “data science”.
You may then think: “As long as you process the data appropriately, you can get valuable information out of it automatically.”
A great many people have this impression.

Let’s try to sort out the categories and scope of the “data analysis” (see figure below).

The figure above shows a simplified representation of the world covered by the term “data analysis” in general. However, please be aware that what I mean by “a practitioner using data (analysis) in practice” is only one part of it (the “Data Analysis” category in the figure).

In general, the technical category handled by a data analysis expert (data scientist) is the top layer in the above figure. In order to become an expert in this upper category (Data Science), you need knowledge and understanding of academic mathematics, statistics, programming and the latest technology.
However, it is quite unusual for an ordinary company to employ in-house data analysis specialists all the time. This is because it is an area that you can outsource from time to time or leave to a “machine”.

On the other hand, huge amounts of data can easily be gathered through the internet, and in the overwhelming majority of cases a “non-data scientist” wants to use it quickly for his/her immediate goal. This is depicted in the middle (Data Analysis) and lower (Data Arrangement/Processing) categories.
Note that in reality there is a huge gap between the top layer and the middle and lower layers (categories).
A business person who is not an expert in data analysis cannot achieve anything merely by having sophisticated analytical tools, methods or statistical theory.

There is also a clear and important reason to separate the middle and lower categories.

From small startups to very large enterprises, there are many companies whose trouble is: “We have lots of data, but we do not get enough results from analyzing it.” Those companies stop at the lower category and never reach the middle one, without noticing the fact.
It is difficult to obtain “useful and convincing” analytical results within the lower category alone. To use data analysis results for your business objective, the goal should be the middle category.

What is absolutely necessary in any case is to identify, BEFORE you start collecting or processing any data, the category in which you will use the data, according to the ultimate goal you want to achieve (do you want to apply the latest technologies, to resolve a problem with data analysis, or simply to visualize the data trend?).

In this article, we will cover the middle and lower categories in the chart. In other words, the latest technology trends and programming for professional data scientists are a totally different story and are outside the scope of this article.
The common issue is that many organizations stop their data utilization in the lower category and have not reached the middle category (Data Analysis). If you can expand the scope of your data utilization to the middle category, then you may get the useful results your team/organization requires.
What this requires is neither “statistical theory” nor “advanced analysis methods and tools” nor “the latest programming technology”.

No matter how basic or how state-of-the-art your data collection and processing methods are, human skills (soft skills, “Data analysis” in the following chart) are required to decide the following:

· What kind of data should be used
· How to interpret and utilize the output/result

As mentioned above, people and organizations that are not familiar with data are completely missing (not merely short of) these soft skills.

■ A common misunderstanding about practical data analysis
Some people might think: “I want to have data analysis skills.”
“If I get to know even more analysis methods, I can obtain additional useful information from my usual data.”
But after struggling with the data for some time, you may come to understand that this idea is just an “illusion”.

Why is it an “illusion”?
There are several reasons for this, but here I will tell you the most obvious one (and the easiest to fall into) (see the figure below).


Before taking any action using data, you should ask the fundamental question: “How fully, and in how much detail, does the data in your hands represent the reality of the issue?”

Examples of data available to any company include “sales results” and “customer satisfaction scores”. Some data can be broken down by product, by customer attribute, by region, by time, etc.
But no matter how far you break down the data, you do not get information like “Why are your sales higher on Friday than on Wednesday?” or “Why is the score in Area A lower than that in Area B?”
You need to come back to the fact that the data shows only a part of reality. Furthermore, the information that analysts can derive from that data is in turn only a part of the overall information that the data holds.

From time to time, I use this expression in my lectures:
“There is no answer in the data”

Under the illusion that “the answer I want to know must be in the data”, I have seen a lot of cases in which people struggle with the data endlessly and end up with no practical results. This is how data analysis fails to work in practice.

So how can you resolve the issue?

Do not search for an answer. Rather, make your own answer and verify it with data!
To do so, you need to begin by defining your issue and goal concretely and developing the necessary logic as a hypothesis.

(To be continued)

Business skill training programs

I offer the following training programs for both business people and academic (business) students:


  • Business problem solving : 

– Goal setting

– Process design

– Issue breakdown

– Cause identification

– Countermeasure making


  • Basic data analysis for business application

– How to apply basic statistics to your business problem solving

– How to apply data/statistics to your business proposals

– Goal setting/ hypothesis approach etc.

– Difference between “data summary” and “data analysis”

– Four dimensions to catch “data information”

– Correlation and regression analysis etc.


  • Logical thinking

– What is “logical”?

– How to avoid illogical thinking

– Necessary process to make a logical conclusion


Here are some options:

  1) One-day training (mainly for business persons)

  2) One week intensive course (mainly for academic students)

  3) A seminar of a few hours


All the programs are designed so that you can apply the skills for practical use in business.


I have over 50 corporate clients in Japan (both Japanese and global companies) as well as organizations in the public sector.



I would be happy to discuss opportunities to deliver my program(s) in English outside Japan.

What is your goal of “Data Analysis”?

Many discussions are going on about “data analysis” and/or “data application to business fields”.


However, their definitions and scopes are not necessarily identical, which causes a lot of confusion and misunderstanding.


One of the biggest gaps that is NOT generally recognized is the difference between “Data science” and “(practical) Data analysis”.


In order to clarify the overall picture of the “Data Analysis” world, you can see the following MAP:

Data analysis 2










Before touching the actual data itself, you need to identify which category you are addressing when you say “Data Analysis”.

It depends on your ultimate goal to achieve with the data.


If you are neither a data analyst nor a data analytics expert and simply want to apply some basic data analysis to your business problem solving, then you do NOT need to learn “Machine learning”, for instance.


What you need to learn is how to design the problem-solving process (Analysis Design) and how to apply basic data analysis (mostly with Excel) to your issue(s).


The biggest issue I have observed at my clients’ business offices is the fact that many people do not even do “Data analysis” but only “Data arrangement”.

This is the reality, which is far away from the “Data Science” world.


I am helping those “usual” clients to improve their business problem-solving skills using “Data Analysis” techniques (not Data Science), as this is what many people need and what contributes to their immediate issues.


I hope this clears up the confusion and helps you identify what you should learn and/or apply.