Data Visualization and Storytelling: the essential skill for a Data Scientist

Posted on February 1, 2021

Data science is one of the most promising interdisciplinary fields, and its reach is expanding rapidly into every industry and walk of life. It applies scientific methods and algorithms to extract knowledge and insights from data, which can then serve as input for decision making and further action.

The possibilities of data have inspired many catchphrases that are repeated often: data as the new oil of the 21st century, data scientist as the sexiest job of the 21st century, analytics as the combustion engine for cracking data, and so on. These phrases have substance behind them. A widely cited IBM report estimated that 90% of the world's data had been generated in the preceding two years. It is also often noted that less than 0.5% of the available data has been put to use so far. In other words, a huge reservoir of data, fed by a heavy stream of new data, remains unexplored. This is the promise of data science: turning unexplored data into actionable insights that can help solve complex problems in any domain and foster continual development.

In this context, it is worth looking closely at a few data science skills and competencies.

Data visualization is arguably the most interesting part of data analytics. It is the process of presenting data through charts and maps, or of mapping data points to a meaningful shape that brings out the information they contain. In that sense, visualization transforms huge, flat and boring data into striking visuals, so that the data starts speaking to you about its hidden facts and concealed information.

When developing a visualization, the most essential skill for a data scientist is relating the visuals to the client's requirement, or to the question the investigator is trying to answer. It is therefore important to restrict the visuals to what really matters. When selecting visuals, take care that they suit both the data type and the purpose they serve. From an industrial and business perspective, data visualization can be used effectively for many purposes, such as annual reports, dashboards, sales analysis, marketing analysis and any information that needs to be interpreted immediately.

There are many visualization tools on the market. When selecting one, the first priority is usually ease of use. Tableau is widely regarded as a grandmaster of visualization, with a large user base in business intelligence. Its most significant features are its wide range of data import options and its mapping capability. A free public version, Tableau Public, is available, but it does not allow you to keep your data analysis private. Other powerful tools include Power BI, Infogram, ChartBlocks, Datawrapper, Google Charts, Excel and SPSS. For data analysts with programming skills, the programming languages R and Python, with powerful libraries such as pandas, numpy and matplotlib, are good options.
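For readers who take the programming route, here is a minimal sketch of what a chart in Python with matplotlib might look like. The sales records, the function names and the output file name are all made up for illustration; they are not taken from any real dataset or from a particular workflow.

```python
from collections import defaultdict

# Hypothetical sales records for illustration only: (region, amount).
SALES = [
    ("North", 120.0),
    ("South", 95.0),
    ("North", 80.0),
    ("East", 60.0),
    ("South", 45.0),
]


def total_by_region(records):
    """Sum the amounts per region, sorted from largest to smallest total."""
    totals = defaultdict(float)
    for region, amount in records:
        totals[region] += amount
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def plot_totals(totals, path="sales_by_region.png"):
    """Draw one bar per region and save the chart as an image file."""
    # Imported here so the summary step above works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display window needed
    import matplotlib.pyplot as plt

    regions = [region for region, _ in totals]
    amounts = [amount for _, amount in totals]

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(regions, amounts)
    ax.set_title("Sales by region (hypothetical data)")
    ax.set_xlabel("Region")
    ax.set_ylabel("Sales")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
    return path


if __name__ == "__main__":
    plot_totals(total_by_region(SALES))
```

A bar chart is used here because the data is one number per category. Sorting the bars by size is a small choice that keeps the visual focused on what matters, which is the ranking of the regions.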

Data storytelling is another essential skill for a data scientist. It takes data visualization to the next level. The clients or groups to whom you communicate your findings and models may not have the technical background to follow a presentation or report drafted in technical terms. Presenting the findings as a visual story makes the presentation or report far more effective. This approach involves the client in the task and can even elicit insights from the participants' own experience, so that the final part of problem solving becomes easier.

An effective story may include the context of the problem, the challenges and conflicts, the chain of cause and effect, the resolution and solutions, and the recommendations and actions. Fairness of data is the most important aspect of the data analytics process, because your data inputs make all the difference. Sometimes a single piece of data can overturn established human heuristics.

Here is a small example of how a storytelling approach turned around a situation in a workplace. The context is a financial downturn at a small company where the management was finding it difficult to pay the employees' salaries. The finance manager held a meeting with the entire staff and presented the situation in the form of a story. He used only five figures. The first was the outstanding balance at the beginning of the month. The second was the revenue during the month. The third was the total expenditure incurred during the month. The fourth was the total amount the company held as liquid cash. The fifth was the total amount needed to pay salaries. It turned out that the amount needed to pay salaries was three times the cash in hand. He then invited the staff to engage with the scenario and come up with solutions and recommendations.

The first, obvious solution that came up was to cut that month's salary to 1/3 for everyone, since company policy was against taking loans. Interestingly, the staff then proposed an alternative: not all members needed their salary equally, because there were more young employees than employees whose salary was the only source of income for their families. The staff therefore agreed to pay full salary to the 1/3 of the staff who needed it most, while the others would claim their unpaid salary in installments over a period of time. They also discussed action plans to boost income in the coming periods so that such a situation could be avoided in future. This is the magic of storytelling.

A working knowledge of SQL is also highly recommended for a data-focused role. It is vital for extracting data from databases, manipulating data and creating data pipelines, all of which are essential stages in the data lifecycle. Proficiency in writing efficient and scalable queries is considered a crucial skill at companies working with large volumes of data.
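As a rough illustration of extracting and summarizing data with SQL, the sketch below runs a query against an in-memory SQLite database using Python's built-in sqlite3 module. The orders table, its columns and its rows are hypothetical, invented only to make the example runnable.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_date TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders (order_date, amount) VALUES (?, ?)",
        [
            ("2021-01-05", 100.0),
            ("2021-01-20", 50.0),
            ("2021-02-01", 75.0),
        ],
    )
    return conn


def monthly_revenue(conn):
    """Return (month, total) rows, with the aggregation done in SQL."""
    query = """
        SELECT strftime('%Y-%m', order_date) AS month,
               SUM(amount) AS total
        FROM orders
        GROUP BY month
        ORDER BY month
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for month, total in monthly_revenue(build_demo_db()):
        print(month, total)
```

Letting the database do the grouping and summing, instead of pulling every row into the program first, is one simple habit that helps queries scale as data volumes grow.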

In short, any professional who anchors a career in data science should master the skills of data visualization and storytelling in order to excel in this domain.
