Whether companies refer to results, outcomes, ROI, or case studies, Big Data and data science are finally moving beyond the hype and proving that they deliver dividends over time. Several new Big Data technologies and predictive tools have been launched to meet the growing demand within business and technology groups to harness the constant growth of both structured and unstructured data within and outside of the enterprise. But such technologies and tools won’t be effective unless you define the problem to be addressed.
Most data science initiatives start with a proof of concept (PoC) or, in some cases, with a proof of value (PoV) if the foundational concept is clearly established. Developing a pipeline of PoCs through working sessions with data scientists, business subject matter experts (SMEs), data experts, and leaders can be extremely helpful. Following this, prioritize PoCs by stack-ranking each of them based on business value and ease of implementation, which factors in availability of data, granularity, and quality.
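As a rough illustration of the stack-ranking step, here is a minimal Python sketch. The 1-to-5 scoring scale, the averaging of the three data factors into an "ease" score, the multiplication of value by ease, and the PoC names are all assumptions made for this example; the post does not prescribe a specific formula.

```python
from dataclasses import dataclass


@dataclass
class PoC:
    """A candidate proof of concept, scored 1 (low) to 5 (high) on each axis.

    The fields and the scale are hypothetical, chosen only for illustration.
    """
    name: str
    business_value: int
    data_availability: int
    data_granularity: int
    data_quality: int

    @property
    def ease(self) -> float:
        # Assumption: ease of implementation is the mean of the data factors.
        return (self.data_availability
                + self.data_granularity
                + self.data_quality) / 3

    @property
    def score(self) -> float:
        # Assumption: combine the two criteria by simple multiplication.
        return self.business_value * self.ease


def stack_rank(pocs):
    """Return the PoCs ordered from highest to lowest combined score."""
    return sorted(pocs, key=lambda p: p.score, reverse=True)


# Hypothetical candidates gathered from a working session.
candidates = [
    PoC("poc_a", business_value=5, data_availability=2,
        data_granularity=3, data_quality=2),
    PoC("poc_b", business_value=4, data_availability=5,
        data_granularity=4, data_quality=4),
    PoC("poc_c", business_value=3, data_availability=3,
        data_granularity=3, data_quality=3),
]

ranked = stack_rank(candidates)
for position, poc in enumerate(ranked, start=1):
    print(position, poc.name, round(poc.score, 2))
```

In practice the weights would come out of the working sessions themselves. The point of the sketch is only that a high-value PoC with poor data can rank below a slightly less valuable one whose data is ready to use.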
As organizations unleash the power of the data lake by giving the business broader access to more and more data, they are facing a growing IT dilemma: how to keep improperly governed or poor-quality data from polluting the data lake.
While IT’s traditional approach to managing data governance and quality has been quite effective over the years, the magnitude of data in today’s data lake is much larger than traditional data warehouse levels. Traditional tools and tactics are being overwhelmed by Big Data in the lake.
There are, however, strategies that organizations can use to reshape data governance and quality standards in the Big Data world. While our tactics and tools are still evolving, I will share some of the efforts we are developing at EMC IT to keep our data lake clean.
Successful companies like Ford and Netflix have deployed more than just innovative consumer service models; they also use cutting-edge cloud native IT architecture to quickly adapt to changing market demands.
Cloud Native is an architectural principle that helps IT developers write applications in a way that maximizes the use of cloud environments, eliminating tight coupling of applications to the underlying infrastructure. Combined with the right Platform as a Service (PaaS) capabilities, this approach reduces your organization’s time to market, increases responsiveness to customer feedback, and cuts operating costs: all the things today’s innovative companies thrive on.
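One common way to loosen the coupling between an application and its infrastructure is to have the application read its configuration from the environment the platform provides, rather than hard-coding hosts or ports. The Python sketch below illustrates that idea only. The variable names `DATABASE_URL` and `PORT`, the default values, and the `Settings` class are assumptions for this example and are not taken from the post or from any specific PaaS.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Runtime settings supplied by the platform (hypothetical fields)."""
    database_url: str
    port: int


def load_settings(env=None) -> Settings:
    """Build settings from an environment mapping.

    Defaults to os.environ; a plain dict can be passed in for testing.
    """
    env = os.environ if env is None else env
    return Settings(
        # Assumed variable names and fallback values for local development.
        database_url=env.get("DATABASE_URL", "sqlite:///local.db"),
        port=int(env.get("PORT", "8080")),
    )


# The same code runs unchanged on a laptop or in a cloud environment;
# only the injected environment differs.
local = load_settings({})
hosted = load_settings({"DATABASE_URL": "postgres://db.example/app",
                        "PORT": "9000"})
print(local)
print(hosted)
```

Because the application never names a specific server, the platform can move or rescale it without a code change.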
If your organization is struggling with how to keep your enterprise data secure in the cloud, you aren’t alone. The fact is, the modern data center poses some fairly new security challenges and there is no rule book on how to meet them. Even in security, we are learning as we go.
From using analytics to predict how our storage arrays will perform in the field, to engineering product configurations to best meet customers’ future needs, EMC is just beginning to tap into the gold mine of intelligence waiting to be extracted from our new data lake.
In fact, we are currently working on dozens of business use cases that are projected to drive millions in revenue opportunities. And we are just scratching the surface. There’s a lot more data available, more to be harvested, and more analytics to be built out as data scientists and business users hit their stride in exploring a new era of data-driven innovation at EMC.
As I noted in my earlier blog (The Analytics Journey Leading to the Business Data Lake), EMC IT embarked on creating a data lake to transition from traditional business intelligence to advanced analytics more than two years ago. A key focus of this effort was to address the fact that data scientists and business users seeking to leverage our growing amount of data were stifled by the need for such projects to go through IT, which was a costly and slow process that discouraged innovation.
We now have the foundation and tools in place to use data and analytics to create sustainable, long-term competitive differentiation. To get here, we worked closely with EMC affiliate Pivotal Software, Inc. to mature together and leverage the multi-tenancy capabilities of their Big Data Suite.
With the expanding volume of information in the digital universe and the increasing number of disk drives required to store that information, disk drive reliability prediction is imperative for EMC and EMC customers.
Figure 1: An illustration of the information expansion in recent years and expected growth
Disk drive reliability analysis, which is a general term for the monitoring and “learning” process of disk drive prior-to-failure patterns, is a highly explored domain both in academia and in the industry. The Holy Grail for any data storage company is to be able to accurately predict drive failures based on measurable performance metrics.
Naturally, improving the logistics of drive replacements is worth big money for the business. In addition, predicting that a drive will fail long enough in advance can facilitate product maintenance, operation and reliability, dramatically improving Total Customer Experience (TCE). In the last few months, EMC’s Data Science as a Service (DSaaS) team has been developing a solution capable of predicting the imminent failures of specific drives installed at customer sites.
For most IT organizations, deploying a successful enterprise hybrid cloud is the next step to bringing together all the efficiencies and capabilities they’ve achieved through infrastructure virtualization, standardization and consolidation, and the ongoing evolution of software automation to deliver self-service capabilities.
At EMC IT, we are in the midst of this hybrid cloud transformation, beginning with an internal hybrid cloud platform, called Atlas, which has been providing agile, on-demand infrastructure (IaaS) to our IT users over the past year.
While our enterprise hybrid cloud is continuing to evolve and grow, I wanted to share some insights with you on our project goals, as well as technology and business choices for this important leg of our IT transformation journey. (For more details, check out our white paper and reference architecture, EMC IT Enterprise Hybrid Cloud.)
EMC IT is innovating and developing new IT solutions that not only meet our internal customers’ growing data and IT demands but also help us drive improved space utilization and energy efficiencies in our modern data centers.
For example, in our regional data center in Cork, Ireland, we used “hot aisle containment” technology to decrease machine energy consumption by 24 percent. In our Hopkinton Data Center, we increased space efficiency and reduced power consumption to extend the facility’s life by five years. And by leveraging IT’s own business analytics tools, we were able to apply predictive and deeper analytics to application and device power usage to drive further efficiencies.
Read more about our Efficient Data Centers and how they further EMC’s commitment to sustainability in EMC’s 2015 Sustainability Report.
From taking charge of healthcare choices to customizing product purchases, today’s consumers are increasingly using self-service, social, and mobile digital capabilities. EMC’s new MyService360 now brings that same personalized, proactive service to our Online Support customers.
Powered by EMC’s data lake solution, MyService360 (launched at EMC World 2016 on May 2) gives EMC Support customers easier and faster access to real-time information at their fingertips. Using its easy-to-read visuals and powerful analytics, customers can view analysis of code levels, health, and risk scoring on their installed EMC products, service activity views by site, incident management, and more.
When it comes to today’s IT, it really isn’t a matter of whether your IT operation should pursue a DevOps strategy and operating model to deliver software in the cloud. The question is how best to transition to this critical new approach. Similar to making the shift to IT as a Service, adopting DevOps is a must-do for IT to survive and be competitive.
DevOps is a big buzzword right now, and it can mean different things to different people. At the end of the day, however, it is really about improving cooperation between IT teams that are traditionally siloed and delivering business value quicker and cheaper.
Like so many things in high tech, DevOps represents a circular evolution in IT. We spent decades siloing IT functions, focusing on segmented competencies in the name of efficiency, and now we realize that shedding those silos and that bureaucracy, collaborating across functions, and using automation to enable individuals to create software more nimbly is the best way to deliver capabilities in the cloud.
The opinions and interests expressed on Dell EMC employee blogs are the employees' own and do not necessarily represent Dell EMC's positions, strategies or views. Dell EMC makes no representation or warranties about employee blogs or the accuracy or reliability of such blogs. When you access employee blogs, even though they may contain the Dell EMC logo and content regarding Dell EMC products and services, employee blogs are independent of Dell EMC and Dell EMC does not control their content or operation. In addition, a link to a blog does not mean that EMC endorses that blog or has responsibility for its content or use.