The data lake is proving to be a crucial tool as EMC IT strives to partner more closely with the business clients it serves to help them get the most out of enterprise Big Data. For example, EMC IT is offering a smart data base that lets business users across the company leverage a uniform customer profile for more efficient and effective sales analytics.
Created in collaboration with EMC Global Services, the CAP (Customer Account Profile) is based on information collected and aggregated from multiple sources to provide a holistic customer view—a single version of the truth, if you will, about our customers.
CAP is managed by IT and is one of the enterprise data sets made available via the data lake to business clients seeking to analyze customer trends, opportunities and insights.
More than ever, businesses see their futures tied to their ability to harness the explosive growth in data. You may even be familiar with the Business Data Lake concept—a central repository of vast information which can be used across an enterprise to drive all business intelligence, advanced analytics and even, eventually, intelligent applications.
We, at EMC IT, are in the process of creating a Business Data Lake, and I will be sharing insights about our efforts in this blog. To start, let’s trace the vision that’s leading EMC IT and other businesses to the shores of this new data landmark.
Viktor Mayer-Schonberger and Kenneth Cukier, authors of Big Data: A Revolution That Will Transform How We Live, Work and Think, wrote, “If big data teaches us anything, it is that just acting better, making improvements – without deeper understanding – is often good enough.”
EMC IT not only recognizes the hidden value of Big Data, but also strives to generate better outcomes. So, we at EMC IT can act better and faster to improve our customers’ experience.
In his November 2013 article, Dan Inbar from EMC’s IT organization eloquently presented what IT has been doing to improve the operations of our Exchange email environment. PAITO (Predictive Analytics for IT Operations) is our Big Data analytics solution for outage prediction that allows our IT operations team to collect, analyze, store, and leverage key indicators to predict and prevent interruption in mission-critical operations. The journey that started more than a year ago as a pilot has evolved into a full-fledged IT data lake and analytics platform for various IT-managed areas, including applications, servers, devices, licenses, network, storage, security and workloads.
In an age when most companies invest to become data-driven, the value of data is increasingly a key criterion for making IT decisions, and the protection of that data becomes paramount to those decisions.
When making backup-related decisions, price justification involves the potential capital loss to the organization when a data loss or unavailability occurs. Understanding the value of data and access to that data is key when prioritizing backup technology or even for deciding which infrastructure to protect during a cyber-attack. However, estimating this price is not trivial.
I recently worked on a research project with a team of academic partners at Ben-Gurion University for prioritizing data replication to minimize the monetary loss in the case of a disaster. The method we derived can limit the costs of data loss, and could provide a high return on investment (ROI) of up to one million dollars per incident.
The 2014 EMC Digital Universe Study, with research and analysis by IDC, predicts that by 2020 the digital universe will contain nearly as many digital bits as there are stars in the universe.
According to the study, digital growth “is doubling in size every two years and by 2020, the digital universe—the data we create and copy annually—will reach 44 zettabytes, or 44 trillion gigabytes.”
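As a quick sanity check on the figures quoted above, the unit conversion and the doubling claim can be worked through in a few lines. This is a minimal sketch, not part of the study: the function names are hypothetical, SI (decimal) units are assumed, and the back-projection assumes an idealized, exact doubling every two years anchored at 44 zettabytes in 2020.

```python
# Assumption: SI (decimal) units, as in "44 trillion gigabytes".
ZB_IN_BYTES = 10**21  # one zettabyte
GB_IN_BYTES = 10**9   # one gigabyte


def zettabytes_to_gigabytes(zb):
    """Convert a whole number of zettabytes to gigabytes."""
    return zb * ZB_IN_BYTES // GB_IN_BYTES


def size_in_year(year, anchor_year=2020, anchor_zb=44.0, doubling_years=2.0):
    """Size in zettabytes implied by exact doubling every `doubling_years`.

    Hypothetical helper: the anchor (44 ZB in 2020) is the study's quoted
    figure; the smooth exponential curve is an idealization.
    """
    return anchor_zb * 2 ** ((year - anchor_year) / doubling_years)


if __name__ == "__main__":
    # 44 ZB expressed in gigabytes: 44 followed by twelve zeros.
    print(f"{zettabytes_to_gigabytes(44):,} GB")
    for year in (2014, 2016, 2018, 2020):
        print(year, f"{size_in_year(year):.1f} ZB")
```

Under these assumptions, 44 zettabytes does equal 44 trillion gigabytes, and a strict two-year doubling would imply roughly 5.5 zettabytes in 2014. Treat the yearly values as an illustration of the growth rate rather than as figures from the study.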
As companies brace for this data tsunami, they are challenged to identify the next business opportunity and to improve risk management, customer engagement and sustainability. They will need to become “predictive enterprises” that leverage their data to define their future focus and how to get there. Sifting massive amounts of data to find relevant insights for business will be a continuous process, constantly evolving and adapting to the business climate. IT departments need to have a robust framework to manage their organizations’ ambitions and goals.
If your organization is like most, you have multiple business groups seeking to leverage pools of segmented Big Data in various ways to improve their operations, gain insight into customers, target marketing efforts, hone product features and more. Maybe you are even one of the few who have gained some significant value from these siloed business analytics using increasingly popular data science techniques.
However, most organizations, including EMC, still have a way to go to become an analytical enterprise, which bases both tactical and strategic decisions on data and analytics. This does not mean that the decision-making is out of the hands of the leadership of the company and the years of experience they bring, but it does mean that every decision has been critiqued based on what your analysis is telling you.
Project: Root cause analysis of difference in support hours
ROI: Model suggests saving of 500-1,000 support hours on average weekly (up to $5M annually)
I have recently made the transition from academic neuroscience to becoming a member of the Data-Science-as-a-Service team in EMC’s IT organization. The change from academia to the business world is far from trivial. Coming from a computational neuroscience lab, where most of the work involved developing probabilistic models for the activity of neural populations, simulations and implementations were not a top priority. As a data scientist with a mostly theoretical background, coping with implementation, let alone implementation in a Big Data environment, is challenging.
Lucky for me, the change of scientific domains underlying the two disciplines is not as large a “leap” as it may seem at first. When you think about predictive analytics, what is more natural than to think of our brain as a complicated learning machine whose main goal is data compression and interpretation?
IT Proven allows you to leverage Dell IT’s first-hand knowledge and best practices to accelerate your own IT transformation journeys, transforming operations and delivering IT as a Service through the power of cloud computing. IT Proven highlights how Dell IT transformed into an agile, innovative, and competitive service provider.
Big Data is changing the way IT organizations operate and deliver solutions to the business. It is a new, contemporary approach for IT to help business users harness and interpret information to drive more efficiency, productivity, performance and value for the business. As EMC IT embraces the Third Platform, we are breaking new ground with Big Data analytics to better position the organization to deliver more competitive solutions.
EMC CIO Vic Bhagat (@VicBhagat) addressed this topic and more in a recent interview with the Pivotal Blog, tackling the questions, challenges and opportunities facing both EMC IT and global CIOs. Where can IT organizations begin? How can they drive new behaviors? How should they address internal clients?
One of the challenges hardware (and software) manufacturers face is estimating the future level of support required to maintain their products. Underestimating the support requirements leads to major losses on the support contract, while overestimating them hurts the competitive edge of the product.
The future level of support includes replacements, repairs, and remote and on-site support. To that end, manufacturers develop reliability models for everything from hard/flash drives to cars and aircraft. These models take into account different configuration parameters of the final product and its internal components.
In 2007, Google conducted a large-scale analysis for a subset of its drive population. It utilized an environment containing a large number of disk drives, collected different types of data from these drives to a Big Data store (Google’s Bigtable) and conducted an analysis of the different Key Performance Indicators (KPIs) and their correlation with drive mortality:
Manufacturer, Models and Vintage
Self-Monitoring, Analysis and Reporting Technology (S.M.A.R.T.)
Contrary to expectations, Google’s researchers found that these KPIs are more useful for predicting trends for a large population than for predicting a single drive failure.
The opinions and interests expressed on Dell EMC employee blogs are the employees' own and do not necessarily represent Dell EMC's positions, strategies or views. Dell EMC makes no representation or warranties about employee blogs or the accuracy or reliability of such blogs. When you access employee blogs, even though they may contain the Dell EMC logo and content regarding Dell EMC products and services, employee blogs are independent of Dell EMC and Dell EMC does not control their content or operation. In addition, a link to a blog does not mean that EMC endorses that blog or has responsibility for its content or use.