How to make data scientists shine

The drive to take advantage of emerging business innovations and of advances in digitization, analytics, artificial intelligence, machine learning, the internet of things, and robotics is leading to growing demand for people with the related skills.

The Challenges

Being a data scientist may be considered the sexiest of the data-related jobs, but it has its challenges, especially when it comes to demonstrating the value created by the work.

In this article, let us look at some of those challenges and at how they can be overcome when organizations take a systematic approach to managing their data.

Lack of a clear question

This is often a communication problem: a business problem has to be turned into a technical problem, and there is a gap between the language and concepts used by the business stakeholders and those used by the data scientists. The causes run deeper, however. They can also be related to a lack of data literacy on the business side, a lack of business literacy on the data side, and the absence of organization-wide business concepts that can be clearly mapped to data.

All this leads data scientists to jump into working with data and tools without a clear understanding of the business requirement.

Inaccessible data

Almost every organization has overflowing data siloed across a range of platforms, software, and formats. Handling this data, and finding, accessing, and consolidating the correct data, is also a major challenge.

The right data is scattered across multiple sources, with different and unclear business concepts, various rules, and different levels of quality. Finding it usually depends on manual data entry and time-consuming searching, which leads to errors, repetition, and redundancy.

Dirty data

Data preparation is often the biggest of the challenges. It is common to hear that 80% of a data scientist's time is spent on cleansing and classifying data. This is even accepted as part of the job, when in fact the quality of the data is the responsibility of everybody in the organization.

Adding to this, the cleansing is often performed autonomously, relying on individual judgment about what the quality rules are and how to make the data comply with them.

Insights not used in management decisions

Strange as it seems, considering the huge investments being made to create data-based insights to feed corporate processes, those insights are frequently ignored, with top managers basing their decisions on their experience or “gut feelings”.

Trust in data is essential, and it is essential throughout the whole data life cycle. It’s not possible to trust data-derived insights when there is no trust in the data itself.

Making Data Scientists Shine

In most organizations, data scientists are placed in an awkward situation: they are asked to produce valuable insights based on the organization’s data. And although they are given the technical tools and conditions to perform this task, they are denied the fundamental base for their work: meaningful, reliable, accurate, quality data.

At the root of this problem, we have an asset that is not being managed.

Organizations need to take a clear stand on managing their most important asset: data.

Creating the right conditions for these data insights to exist and be trusted, and hence to add true value to the organization, is a complex problem. It must be approached in an “organic” way within the organization, allowing a data culture to grow naturally, founded on results and built upon success stories that the business stakeholders and decision makers can relate to.

Any successful data strategy is necessarily a business strategy. Data’s purpose is to create business value, so any data strategy must be oriented towards the organization's strategic priorities and key business objectives.

The key to achieving these business objectives, and to aligning the data strategy and any data initiatives with them, is to make the business the driver of the data strategy.

This means creating a set of initiatives that are driven and oriented by the business stakeholders and units, working on use cases that are grounded in solid business cases. The business identifies its needs. From there, work with the business to identify the data that is necessary, the rules that govern that data, the business concepts behind the data, and the quality standards and rules that should be met. Doing so removes the overload that data scientists are currently taking on and builds the trust needed for the insights produced to become actionable and add value to corporate decision processes.
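As an illustration only, agreed quality rules can be written down once and shared, instead of being left to each data scientist's individual judgment. The sketch below assumes a hypothetical customer record and three hypothetical rules with hypothetical owning business units. None of these names come from a real system, and a real rule set would be defined together with the business stakeholders.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class QualityRule:
    name: str                       # business-facing description of the rule
    owner: str                      # hypothetical business unit accountable for it
    check: Callable[[dict], bool]   # returns True when a record complies


# Hypothetical rules for a hypothetical "customer" record.
RULES = [
    QualityRule("customer_id is present", "Sales",
                lambda r: bool(r.get("customer_id"))),
    QualityRule("country is a 2-letter code", "Sales",
                lambda r: isinstance(r.get("country"), str)
                and len(r["country"]) == 2),
    QualityRule("annual_revenue is not negative", "Finance",
                lambda r: isinstance(r.get("annual_revenue"), (int, float))
                and r["annual_revenue"] >= 0),
]


def failed_rules(record: dict, rules=RULES) -> list[str]:
    """Return the names of the rules that the record violates."""
    return [rule.name for rule in rules if not rule.check(record)]


# Made-up sample records, used only to show the shape of the report.
records = [
    {"customer_id": "C-001", "country": "PT", "annual_revenue": 125000.0},
    {"customer_id": "", "country": "Portugal", "annual_revenue": -10},
]

# Map each record's position to the list of rules it fails.
report = {i: failed_rules(r) for i, r in enumerate(records)}
print(report)
```

Because each rule carries a name and an owner, a failed check points back to a business concept and to the unit accountable for it, rather than to a private cleansing decision.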

The need to address governance and quality must not become an impediment to data initiatives. Move forward with a focus on delivering the intended results, while ensuring that data quality management and governance play a major and ongoing role along the way.

Use an iterative, agile approach, making continuous adjustments as you go. This will give your data management initiatives the momentum they need. Organizing these initiatives into smaller sprints will make the changes more manageable.

Refocusing data initiatives by starting small is the best approach, and the same is valid for data governance and quality. The aim is to provide a framework that integrates into existing environments and uses existing standards, allowing organizations to do something tactical with a high ROI.

By focusing first on the tactical deployment, organizations can then build on the first success, allowing the data strategy to evolve naturally and grow from these small success stories.