Author Archive

Shahidul Mannan

Sr. Director, Big Data and Analytics, Dell IT

Keeping Garbage Out of the Lake: Data Governance in the Big Data World

As organizations unleash the power of the data lake by giving the business broader access to more and more data, they are facing a growing IT dilemma: how to keep improperly governed or poor-quality data from polluting the data lake.

While IT’s traditional approach to managing data governance and quality has been quite effective over the years, the volume of data in today’s data lake far exceeds traditional data warehouse levels. Traditional tools and tactics are being overwhelmed by Big Data in the lake.

There are, however, strategies that organizations can use to reshape data governance and quality standards in the Big Data world. While our tactics and tools are still evolving, I will share some of the efforts we are developing at EMC IT to keep our data lake clean.


The Power of Self-Service Big Data

From using analytics to predict how our storage arrays will perform in the field, to engineering product configurations to best meet customers’ future needs, EMC is just beginning to tap into the gold mine of intelligence waiting to be extracted from our new data lake.

In fact, we are currently working on dozens of business use cases that are projected to drive millions in revenue opportunities. And we are just scratching the surface. There’s a lot more data available, more to be harvested, and more analytics to be built out as data scientists and business users hit their stride in exploring a new era of data-driven innovation at EMC.

As I noted in my earlier blog (The Analytics Journey Leading to the Business Data Lake), EMC IT embarked more than two years ago on creating a data lake to transition from traditional business intelligence to advanced analytics. A key focus of this effort was to address the fact that data scientists and business users seeking to leverage our growing amount of data were stifled by the need for such projects to go through IT, a costly and slow process that discouraged innovation.

We now have the foundation and tools in place to use data and analytics to create sustainable, long-term competitive differentiation. To get here, we worked closely with EMC affiliate Pivotal Software, Inc., maturing alongside it and leveraging the multi-tenancy capabilities of its Big Data Suite.


The Analytics Journey Leading to the Business Data Lake

More than ever, businesses see their futures tied to their ability to harness the explosive growth in data. You may even be familiar with the Business Data Lake concept—a central repository of vast information which can be used across an enterprise to drive all business intelligence, advanced analytics and even, eventually, intelligent applications.

We, at EMC IT, are in the process of creating a Business Data Lake, and I will be sharing insights about our efforts in this blog. To start, let’s trace the vision that’s leading EMC IT and other businesses to the shores of this new data landmark.
