Viktor Mayer-Schönberger and Kenneth Cukier, authors of Big Data: A Revolution That Will Transform How We Live, Work and Think, wrote, “If big data teaches us anything, it is that just acting better, making improvements – without deeper understanding – is often good enough.”

EMC IT not only recognizes the hidden value of Big Data but also strives to turn it into better outcomes, so that we can act better and faster to improve our customers’ experience.

In his November 2013 article, Dan Inbar from EMC’s IT organization eloquently presented what IT has been doing to improve the operations of our Exchange email environment. PAITO (Predictive Analytics for IT Operations) is our Big Data analytics solution for outage prediction that allows our IT operations team to collect, analyze, store, and leverage key indicators to predict and prevent interruption in mission-critical operations. The journey that started more than a year ago as a pilot has evolved into a full-fledged IT data lake and analytics platform for various IT managed areas, including applications, servers, devices, licenses, network, storage, security and workloads.

A data-driven approach to profiling and proactively identifying anomalous behavior from metrics and event log data is core to PAITO’s analytics capabilities. PAITO is a real-time solution designed for scalability: it handles high-velocity, high-variety data ingestion and supports self-service streaming analytics as well as historical data analytics. These characteristics are essential to building a sustainable data lake and analytics platform that can adapt to the growth and complexity of our IT environment.

Building a real-time big data analytics platform requires a sophisticated stream data ingestion process. PAITO collects Exchange server performance counters, along with system and application logs, remotely from each monitored server every 120 seconds and streams them into the PAITO Data Lake. Performance counters are important for monitoring performance and load, but prediction comes from the information hidden in the system and application logs, read in conjunction with the performance counters. These logs must be collected and consumed in real time to predict outcomes, which makes ingestion one of the most critical components of PAITO.
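PAITO's actual collector is not published here, so the following is only a rough Python sketch of the polling pattern described above. The 120-second interval comes from the text; the `Sample` record layout and the `read_counters`, `read_logs`, and `sink` callables are hypothetical placeholders for whatever remote collection and data lake APIs a real deployment would use.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List

POLL_INTERVAL_SECONDS = 120  # collection interval stated in the post


@dataclass
class Sample:
    """One reading from one monitored server (hypothetical record layout)."""
    server: str
    collected_at: float
    counters: Dict[str, float]
    log_lines: List[str] = field(default_factory=list)


def collect_once(
    servers: List[str],
    read_counters: Callable[[str], Dict[str, float]],
    read_logs: Callable[[str], List[str]],
    now: Callable[[], float] = time.time,
) -> List[Sample]:
    """Poll every server once and return one Sample per server."""
    return [
        Sample(
            server=server,
            collected_at=now(),
            counters=read_counters(server),
            log_lines=read_logs(server),
        )
        for server in servers
    ]


def run_collector(
    servers: List[str],
    read_counters: Callable[[str], Dict[str, float]],
    read_logs: Callable[[str], List[str]],
    sink: Callable[[List[Sample]], None],
    cycles: int,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run a fixed number of polling cycles, pushing each batch to `sink`."""
    for cycle in range(cycles):
        sink(collect_once(servers, read_counters, read_logs))
        if cycle < cycles - 1:
            # Wait one interval between cycles, but not after the last one.
            sleep(POLL_INTERVAL_SECONDS)
```

Passing `sleep` and `now` as parameters keeps the loop easy to exercise without waiting for real time to elapse; a production collector would more likely run continuously and write to a message queue or stream processor rather than a Python callable.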

PAITO’s analytics module extends the platform to support a series of analytics tasks. It uses a deductive learning process to understand the state of the system at any given time. The state of the system is based on a queue of events that have already occurred, and as the system arrives at a particular state, the analytics module continuously predicts the states likely to follow. These predictions are expressed as probabilities. Whenever a future state scores a probability higher than a pre-defined threshold, the analytics module triggers an alert.
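The post does not name the model behind these predictions. As one possible illustration only, the sketch below swaps in a first-order Markov chain: it estimates transition probabilities from an ordered history of states and flags any watched future state whose probability exceeds a threshold. The state names, the `ALERT_THRESHOLD` value, and both function names are assumptions made for the example, not PAITO's implementation.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple

ALERT_THRESHOLD = 0.5  # hypothetical pre-defined threshold


def fit_transitions(history: Iterable[str]) -> Dict[str, Dict[str, float]]:
    """Estimate P(next state | current state) from an ordered state history."""
    states = list(history)
    counts: Dict[str, Counter] = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for current, c in counts.items()
    }


def states_to_alert(
    transitions: Dict[str, Dict[str, float]],
    current_state: str,
    watched_states: Set[str],
    threshold: float = ALERT_THRESHOLD,
) -> List[Tuple[str, float]]:
    """Return (state, probability) for each watched next state above the threshold."""
    probabilities = transitions.get(current_state, {})
    return [
        (state, p)
        for state, p in probabilities.items()
        if state in watched_states and p > threshold
    ]


# Example with made-up states: alert if "outage" is likely to follow "degraded".
history = ["ok", "ok", "degraded", "outage", "ok",
           "degraded", "outage", "ok", "degraded", "ok"]
transitions = fit_transitions(history)
alerts = states_to_alert(transitions, "degraded", {"outage"})
```

In this toy history, "degraded" is followed by "outage" in two of three cases, so the example raises an alert from the "degraded" state but not from "ok". A real system would derive states from the event queue rather than from hand-labeled strings.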

For example, PAITO evaluates performance counters of Windows servers and computes health scores in real time. If the health score of any server falls below zero following a certain pattern, PAITO immediately alerts support staff to take follow-up action. In addition, the analytics module follows an adaptive learning technique in which historical data is used to discover new states of the system, which then become inputs for the prediction algorithm.
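How PAITO computes its health score, and which pattern triggers an alert, are not spelled out in this post. The Python sketch below is therefore purely illustrative: it assumes a score defined as the smallest remaining headroom across counters (negative once any counter exceeds its limit) and assumes the alert pattern is three consecutive negative scores. The counter names, limits, and window size are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict

# Hypothetical per-counter limits; real values would come from profiling.
LIMITS: Dict[str, float] = {"cpu_percent": 90.0, "queue_length": 500.0}
CONSECUTIVE_BREACHES = 3  # hypothetical pattern: three negative scores in a row


def health_score(counters: Dict[str, float],
                 limits: Dict[str, float] = LIMITS) -> float:
    """Smallest remaining headroom across counters, as a fraction of each limit.

    Positive while every counter is under its limit; negative once any
    counter exceeds its limit.
    """
    return min((limits[name] - counters[name]) / limits[name] for name in limits)


class HealthMonitor:
    """Tracks recent scores for one server and reports when to alert."""

    def __init__(self, window: int = CONSECUTIVE_BREACHES) -> None:
        self.recent: Deque[float] = deque(maxlen=window)

    def observe(self, counters: Dict[str, float]) -> bool:
        """Record one sample; return True when every score in a full window is negative."""
        self.recent.append(health_score(counters))
        window_full = len(self.recent) == self.recent.maxlen
        return window_full and all(score < 0 for score in self.recent)


# Example with made-up readings for a single server.
healthy = {"cpu_percent": 45.0, "queue_length": 100.0}
overloaded = {"cpu_percent": 99.0, "queue_length": 100.0}
monitor = HealthMonitor()
alerts = [monitor.observe(s)
          for s in [healthy, overloaded, overloaded, overloaded, healthy]]
```

Requiring several consecutive breaches is one simple way to avoid alerting on a single noisy sample; other patterns, such as a trend over a longer window, would fit the same interface.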

Scaling to process a large volume of structured and unstructured data, with stream data processing and rapid data access, requires a data processing layer that combines in-memory computing with a high-speed MPP (massively parallel processing) framework. Our Greenplum MPP database, combined with the Hadoop Distributed File System (HDFS) and a real-time analytical model execution engine on Storm, gives PAITO a scalable real-time analytical platform.

In addition to information extraction and analytic prediction, PAITO’s data visualization features are valuable for understanding historical trends and identifying scope for future improvement.

In conclusion, PAITO connects Big Data and analytics through the convergence of complex data and data-driven decision-making capabilities. It has enabled both business and IT to ask the right questions about the data sitting in IT platforms. As a result, it has fostered new ways to discover metrics for measuring performance, and to proactively monitor and predict the behavior of critical services. Looking back on the journey, it has been a paradigm shift for us, equipping our command center with insightful information and transforming our Exchange platform into a self-monitored, proactive system. The result is a unique opportunity to serve our customers better.

Bhanu Dhanaraj

Sr. Manager, Enterprise Analytics, EMC IT


