
(Data analysis: Recipe theory) What do you want to achieve? What do you need to learn?

Which knowledge and techniques you should learn really depends on your goal.

Still, many people try to learn more about (academic) statistics or new technologies such as big data, data science, machine learning, Python, R or AI, even though in most cases (especially for non-data scientists who want to use data analysis in their daily business) these do not contribute to achieving the goal.

 

Let’s look at a simple story:

Gas cooker

 

 

 

Now let’s assume your ultimate goal is “making awesome dinner”.

Do you think that learning how to make a fire really contributes to the goal?

 

Learning how to make a fire is just like learning academic statistics such as the t-test, the chi-square test, standard deviation, etc.

 

It used to be useful and important to learn how to calculate a standard deviation and what its theoretical definition is. But you no longer have to know everything, because machines (software, tools, applications) have already taken over that work.

 

The “Gas cooker” represents those tools, which you can rely on for making a fire.

They calculate these values faster and more accurately than you can by yourself, just as a microwave heats a meal even if you do not know how it works.

You do not have to know the mechanism of the microwave to achieve the goal.

What is the reason you still need to keep calculating manually or with a calculator?
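
To make the point concrete, here is a minimal Python sketch with made-up numbers (the `visits` list is hypothetical, not data from a real case). It computes a sample standard deviation once by hand and once with the standard library's `statistics.stdev`.

```python
import math
import statistics

# Hypothetical sample: number of store visits for five customers.
visits = [2, 4, 4, 5, 7]

# By hand: mean, squared deviations, divide by n - 1, take the square root.
average = sum(visits) / len(visits)
squared_deviations = sum((x - average) ** 2 for x in visits)
manual_sd = math.sqrt(squared_deviations / (len(visits) - 1))

# With a tool: one call to the standard library.
tool_sd = statistics.stdev(visits)

print(manual_sd, tool_sd)
```

Both values agree, and the tool needs only one line. That leaves your time for choosing the data and interpreting the result.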

 

More importantly, you still cannot achieve your ultimate goal (dinner) just by using a gas cooker.

You have to know:

  – What material (data) to use for the meal?

  – How to cut the material (data)?

  – What tools (analytical tools) to use to cook?

  – How to cook (what arrangement is needed etc.)?

 

Those skills and that knowledge are described in a recipe, and the recipe is exactly what you need to learn to achieve the goal (making dinner), because the rest of the process is handled by technology (the gas cooker), not by you.

 

This is why I am teaching “HOW TO COOK (Recipe)” to my clients and students, rather than how to operate analytical machines.

 

Be clear about what you want to achieve and what you need to learn.

 

I call this concept “Recipe theory”.

#Data analysis

 

 

 

 

 


How to proceed with problem-solving by data analysis

This slide is designed to help my business students (international students at Yokohama National University) understand the process of problem-solving with data analysis in my class.

 

The slide uses a simple marketing case in which you verify the effectiveness of a D/M (direct mail) promotion.

problem-solving

In general, your first hypothesis is a choice of which axes to use to break down the raw data.

In this case, there are only two axes (direct mail delivery and gender) for simplicity.

 

 

By illustrating the gap for each axis, you can identify which axis has the largest gap (in this case, D/M delivery has a larger gap than the gender axis).

You can also see that the gap comes from female customers.
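
The comparison of gaps can be sketched in a few lines of Python. The records below are invented for illustration (they are not the data on the slide), and `RECORDS`, `average_by` and `gap` are hypothetical names chosen for this sketch.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (gender, D/M received, visits per month).
RECORDS = [
    ("F", "yes", 6), ("F", "yes", 7), ("F", "yes", 8),
    ("F", "no", 1), ("F", "no", 2), ("F", "no", 3),
    ("M", "yes", 2), ("M", "yes", 3), ("M", "yes", 4),
    ("M", "no", 2), ("M", "no", 3), ("M", "no", 4),
]
AXES = {"gender": 0, "dm": 1}  # axis name -> column index in a record


def average_by(records, column):
    """Average visit frequency for each value found in the given column."""
    groups = defaultdict(list)
    for row in records:
        groups[row[column]].append(row[2])
    return {value: mean(visits) for value, visits in groups.items()}


def gap(averages):
    """Difference between the highest and lowest group average."""
    return max(averages.values()) - min(averages.values())


gaps = {name: gap(average_by(RECORDS, col)) for name, col in AXES.items()}
key_axis = max(gaps, key=gaps.get)  # the axis with the largest gap
print(gaps, key_axis)
```

With these made-up numbers the D/M axis shows the larger gap, so it is the axis to examine further.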

 

Now you are ready to develop the next hypothesis (Reasoning hypothesis) for the current result.

Questions you can ask yourself to form this hypothesis include:

– Why are only female customers affected by the D/M delivery?

– What information in the D/M affects those customers?

 

With those questions, you may need to collect data to verify the correlation between those factors and your findings (only female customers are affected by D/M delivery).

 

This is only a simple case to show students the problem-solving process.

However, it is essential to know the process BEFORE starting it; otherwise you will get lost partway through, without knowing where you are or what you need to do next.

 

 

Data analysis basic exercise (comparison and graphing)

How to find the insights from data

 

The business data analysis class has started again at Yokohama National University.

The class is designed mainly for 1st and 2nd year international students, to give them business data analysis (problem-solving) skills.

 

We focused on some basic techniques and on the thinking process for turning raw data into insights.

This was one of the exercises we used in the class:

Data exercise

The students were required to reach their own “conclusion” (not just findings or analysis results) with Excel.

In this case, the output (goal) parameter is “frequency of visit”, while the input (explanatory) parameters are (i) Gender and (ii) Direct Mail (D/M) received or not.

 

To get a first insight from the data, you can compare the data along each parameter (in this case, gender and D/M received). Let’s look at examples of the outputs:

Data exercise 2

Data exercise 3

In order to simplify the original data, averages are used in the upper matrix.

The bar charts show the comparison for each of the two parameters (gender and D/M).

 

Which parameter is more critical to the goal? Look at the size of the difference!

 

“D/M received” shows a larger gap between “YES” and “NO” than the gap between the genders.

So you should focus on the D/M effect rather than the gender difference. (Let’s call D/M the “Key parameter”.)

 

Then you can dig into the D/M effect to find more detail (by splitting it by gender).

Data exercise 4

 

This is an example of a conclusion drawn from the chart above:

 

“Direct mail affects the frequency of visits, but only for female customers. Therefore, my proposal is to keep sending D/M to potential female customers to get more visitors.”

 

Again, here are the points to learn from the exercise:

  • How to find the key parameter to dig into further?

          => Find the parameter with the “larger gap”, which indicates a more critical factor for the goal/issue.

 

  •   What to do once you find the key parameter?

          => Split the data by another parameter to dig into the “Key parameter”.
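
The second point, splitting the key parameter by another parameter, can be sketched in Python as follows. The numbers are invented for illustration (they are not the class data), and `CUSTOMERS` and `gap_within` are hypothetical names chosen for this sketch.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, invented for illustration only.
ROWS = [
    ("F", "yes", 6), ("F", "yes", 7), ("F", "yes", 8),
    ("F", "no", 1), ("F", "no", 2), ("F", "no", 3),
    ("M", "yes", 2), ("M", "yes", 3), ("M", "yes", 4),
    ("M", "no", 2), ("M", "no", 3), ("M", "no", 4),
]
CUSTOMERS = [dict(zip(("gender", "dm", "visits"), row)) for row in ROWS]


def gap_within(customers, key_param, split_param):
    """For each value of split_param, return the gap in average visits
    between the groups defined by key_param."""
    groups = defaultdict(lambda: defaultdict(list))
    for customer in customers:
        groups[customer[split_param]][customer[key_param]].append(customer["visits"])
    result = {}
    for split_value, by_key in groups.items():
        averages = [mean(visits) for visits in by_key.values()]
        result[split_value] = max(averages) - min(averages)
    return result


# Dig into the key parameter (D/M) by splitting it by gender.
split_gaps = gap_within(CUSTOMERS, key_param="dm", split_param="gender")
print(split_gaps)
```

In this made-up data the D/M gap appears only among female customers, which is the kind of finding that supports a conclusion like the one quoted above.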

 

 

Part 1: What does it mean “to utilize data analysis in practice”? What do you need?

■ “I cannot master the data” because “XX is not enough”
As a professional in solving practical problems using data analysis, I conduct business training, practical support, lectures, etc. I also teach at business schools and universities.

Many people have already studied data analysis and statistics. Nevertheless, I always hear the following concerns:
“I am somehow not comfortable with the results”
“We have data, but I do not feel that I have mastered it”
“I do not get the results I expected”
“I cannot actually use what I learned”

These problems are not necessarily caused by misuse of hard skills such as “analytical methods”, “data science” or “statistics”. Rather, the soft skills of the people using them are insufficient.
■ Then, what is practical “data analysis”?
Can you clearly and specifically answer the questions “What do you need for data analysis?” and “What do you need to do for data analysis?” What is the output of “data analysis”, and what can you do with that output to achieve your goal?
In fact, many people tend to start playing around with data without making those answers clear. In such cases, you may not be able to draw any useful information out of the data.

We often hear the term “data science”.
You may then think, “As long as I process the data appropriately, valuable information will come out of it automatically.”
A great many people have this impression.

Let’s try to sort out the categories and scope of the “data analysis” (see figure below).

Chart1
The figure above shows a simplified representation of the world generally covered by the term “data analysis”. However, please be aware that what I mean by “a practitioner using data (analysis) in practice” is only one part of it (the “Data Analysis” category in the figure).

In general, the technical category handled by a data analysis expert (data scientist) is the top layer in the above figure. To become an expert in this upper category (Data Science), you need knowledge and understanding of academic mathematics, statistics, programming and the latest technology.
However, it is quite unusual for an ordinary company to keep in-house data analysis specialists on staff at all times. This is because it is an area that you can outsource when needed or leave to a “machine”.

On the other hand, huge amounts of data can easily be gathered through the internet, and in the overwhelming majority of cases a “non-data scientist” wants to use it quickly for an immediate goal. This is depicted in the middle (Data analysis) and lower (Data Arrangement/Processing) categories.
Note that in reality there is a huge gap between the top layer and the middle and lower layers (categories).
A business person who is not an expert in data analysis cannot get results simply by having sophisticated analytical tools, methods and statistical theory.

There is also a clear and important reason for separating the middle and lower categories.

From small startups to very large enterprises, many companies have the problem of “We have lots of data, but our analysis does not produce enough results”. Those companies stop at the lower category and never reach the middle one, without even noticing it.
It is difficult to obtain “useful and convincing” analytical results within the lower category alone. To use data analysis results for your business objective, the goal should be the middle category.

What is absolutely necessary in every case is to identify the category in which you will use the data, according to the ultimate goal you want to achieve (Do you want to apply the latest technologies? To resolve a problem with data analysis? Or simply to visualize a trend in the data?), BEFORE you start collecting or processing any data!

In this article, we will cover the middle and lower categories in the chart. In other words, the latest technology trends and programming for professional data scientists are a different story and are outside the scope of this article.
The common issue is that many organizations stop their data utilization at the lower category and have not reached the middle category (Data analysis). If you can expand the scope of your data utilization to the middle category, you can get the useful results your team or organization requires.
What is needed to get there is neither “statistical theory” nor “advanced analysis methods and tools” nor “the latest programming technology”.

No matter how basic the data collection and processing methods are, or how much they rely on state-of-the-art technology, human skills (soft skills, “Data analysis” in the following chart) are required to decide the following:

· What kind of data should be used
· How to interpret and utilize the output/result

As mentioned above, people and organizations who are not familiar with data are not merely short of these soft skills; they lack them completely.

Chart2
■ A common misunderstanding about practical data analysis
Some people might think, “I want to have data analysis skills.”
“If I learn even more analysis methods, I can obtain additional useful information from the data I already have.”
But after struggling with the data for some time, you may realize that this idea is just an “illusion”.

Why is it an “illusion”?
There are several reasons for this, but here I will describe the most obvious one, which is also an easy trap to fall into (see the figure below).

Chart3

Before taking any action with data, you should ask the fundamental question: “How completely, and in how much detail, does the data in your hands represent the reality of the issue?”

Examples of data available to any company include “sales results” and “customer satisfaction scores”. Some data can be broken down by product, customer attribute, region, time, etc.
But no matter how far you break the data down, you do not get information such as “Why are sales higher on Friday than on Wednesday?” or “Why is the score in Area A lower than in Area B?”
You need to come back to the fact that data shows only a part of reality. Furthermore, the information that analysts can derive from the data is itself only a part of all the information the data contains.

From time to time, I use this expression in my lectures:
“There is no answer in the data”

I’ve seen a lot of cases in which people, under the illusion that “the answer I want to know must be in the data”, struggle with the data endlessly and end up with no practical results. This is how data analysis fails in practice.

So how can you resolve the issue?

Do not search for an answer. Rather, make your own answer and verify it with data!
To do so, you need to begin by defining your issue and goal concretely and developing the necessary logic as a hypothesis.

(To be continued)

What is your goal of “Data Analysis”?

Many discussions are going on about “data analysis” and/or “data application to business fields”.

 

However, the definitions and scope of these terms are not necessarily the same from one discussion to the next, which causes a lot of confusion and misunderstanding.

 

One of the biggest gaps that is NOT generally recognized is the difference between “Data science” and “(practical) Data analysis”.

 

In order to clarify the overall picture of the “Data Analysis” world, you can see the following MAP:

Data analysis 2

 

 

 

 

 

 

 

 

 

Before touching the actual data itself, you need to identify which category you are addressing when you say “Data Analysis”.

It depends on the ultimate goal you want to achieve with the data.

 

If you are NOT a data analyst or a data analytics expert and simply want to apply some basic data analysis to your business problem solving, then you do NOT need to learn “Machine learning”, for instance.

 

What you need to learn is how to design the problem-solving process (Analysis Design) and how to apply basic data analysis (mostly in Excel) to your issue(s).

 

The biggest issue I have observed at my clients’ offices is that many people do not even do “Data analysis”; they do only “Data arrangement”.

This is the reality, which is far away from the “Data Science” world.

 

I am helping those “usual” clients improve their business problem-solving skills using “Data Analysis” techniques (not Data Science), as that is what many people need and what contributes to solving their immediate issues.

 

I hope this clears up some of the confusion and helps you identify what you should learn and/or apply.

 

 

DATA ANALYSIS DESIGN APPROACH

Here is part of my presentation from the “data analysis” seminar.

I always emphasize the importance of taking the right approach to a problem when you apply “data analysis” to solve it.

 

I have found that many people struggle to find effective solutions based on data, especially when they start analyzing the data without properly defining/formulating the problem and forming hypotheses. Even if you find something in the data, it might not be effective enough to solve the fundamental problem you have.

 

I call this necessary part of the problem-solving process “ANALYSIS DESIGN”. My training programs all focus on the skill set needed to design the analysis (i.e. problem definition/formulation and hypothesis making) so that participants can find a “right” solution.

 

Data analysis

This is something you should learn before learning data analysis methodology or difficult statistical theory if you want analytical skills that you can apply to business problem solving.

It is also an important skill for many business people in the AI (Artificial Intelligence) era, as the analysis itself can already be done by machines.

 

I have programs to train business people and university students on this subject.

 

http://data-story.net/english/

My programs in English

I have some business skills training programs in English, such as “Business data analysis”, “Logical thinking for business problem-solving” etc.

I also have some class programs for undergraduate/graduate students at university.

Please find more details at the following link: http://data-story.net/english/

The skills you can learn in the programs are “fundamental”, “practical” (not purely academic) and “effective” in any type of business.

It is also possible to customize the contents of the programs according to your requirements and conditions (e.g. from a one-hour talk to a one-week intensive course).

 

Please feel free to ask any questions about my programs.