Welcome to the first installment of a series of Q&A blog posts. My name is Daniela Williams, Project Manager at ISE, and I'll be asking ISE team members questions to help give insight into what makes us tick.
First up is our website cover model, Matt Coventon (yep, all the people in our website photos are actual team members). Matt is a Senior Engineer and the practice lead for the Big Data professional service. So sit back, learn a bit about Big Data, and get a sneak peek at Matt's presentation at the In-Memory Computing Summit coming up on May 23rd-24th.
Getting Started with Big Data
Q. How long have you been at ISE and what did you do prior?
Matt: I'm proud to say I've been at ISE for nearly 10 years. And I'm thankful for the many opportunities I've had during those years to truly make a difference for our customers. I'm also very thankful to come to work every day in such an energetic and positive environment. Prior to ISE, I was a Software Engineer at a telecommunications company where I helped build applications to power their sales force.
Q. How would you define Big Data?
Matt: Big Data is a set of technologies and skills that allows you to process, analyze, and extract value from data that is either too large, moving too fast, or too varied in its structure to be processed using traditional methods. That's commonly referred to as the 3 V's: Volume, Velocity, and Variety. It's important to note that Big Data is not just new technology; it also represents a new skill set for applying that technology. For example, Hadoop and MapReduce were revolutionary technologies, but they required a new understanding of how to break down your data problem into the proper MapReduce processing steps. In addition, Big Data is not a replacement for traditional databases (think SQL Server or Oracle). Rather, it is a new, complementary set of tools that allows us to discover new insights and create new products from data. Finally, Big Data doesn't equal Hadoop. The Hadoop ecosystem is certainly the most prominent technology example, but there is much more, such as NoSQL databases including MongoDB and Cassandra, streaming data frameworks like Spark and Flink, and even alternative resource management technologies like Mesos.
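To make the idea of breaking a problem into MapReduce steps concrete, here is a minimal single-process sketch of the classic word-count example in plain Python. It is only an illustration of the map, shuffle, and reduce phases, not Hadoop code and not something from Matt's projects; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(records):
    """Run the three phases in order on an in-memory list of records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big ideas"]))
```

In a real cluster the map and reduce functions run in parallel on many machines and the framework performs the shuffle; the skill Matt describes is expressing your problem as these independent steps.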
Q. What got you excited about Big Data?
Matt: My background is in large-scale enterprise systems that are able to support high volumes of transactions. Whenever I was building a new web application or enterprise middleware I was the guy thinking, "How many events will this be able to process? How are we going to make this simple to scale?" I loved the challenge of designing high performance, distributed systems, then testing and tuning them. Big Data just takes all those things I enjoy to a whole new level.
Applying Big Data Knowledge
Q. How are our customers using Big Data?
Matt: The main way that we're using Big Data with our customers today is in the IoT space. We help our customers collect large volumes of sensor data in both streaming and batch workflows. With our strong background in vehicle telematics, it's only natural that we would apply Big Data in that space. We help our customers monitor vehicles and understand how they can improve the performance of the vehicle as well as of the driver. Manufacturing is another area of focus for us. Big Data and Machine Learning can help our customers monitor production quality and predict, and then avoid, manufacturing downtime.
Q. What advantages does Big Data offer?
Matt: The main advantage of Big Data is that size is no longer the limiting factor. In the past we would only keep data for a certain period of time, and we would only analyze a subset of data because that's all we could fit into our disk-, CPU-, and memory-constrained systems. With Big Data you can distribute the storage and processing of data across many machines, so those limits no longer apply. More than that, analysis on large data sets can be accomplished much more quickly this way. This opens up new value streams, and new product and service possibilities that simply were not possible before. It has also energized the areas of Machine Learning and Neural Networks (Deep Learning), because more data combined with more processing power (especially in the Cloud) is allowing us to build better algorithms and predictive models.
Q. Where do you see the industry going and how do you think it will transform business practices?
Matt: One key trend is reducing the cycle time from data collection to insights to action. A lot of people call this Fast Data. The point is that I don't want to wait for a critical insight. I want insights delivered in time for me to take action before it's too late. Today we see streaming data processing maturing at a fast pace with new frameworks becoming available (such as Apache Apex or Kafka Streams) as well as efforts to standardize the API and streaming design patterns (for example, Apache Beam). In addition, in-memory computing frameworks are becoming more prevalent (such as Apache Ignite). These frameworks store data in RAM to dramatically improve processing speeds and will only grow in popularity as the cost of RAM continues to go down.
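As a rough illustration of the in-memory streaming idea, the sketch below keeps a sliding time window of recent sensor readings in RAM and returns an up-to-date average as each event arrives, instead of waiting for a batch job. It is plain Python, not Apache Ignite, Apex, Kafka Streams, or Beam code; the class name SlidingWindowAverage and the example readings are assumptions made up for this sketch.

```python
from collections import deque


class SlidingWindowAverage:
    """Keep recent (timestamp, value) events in memory and average them."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = deque()  # oldest event on the left
        self._total = 0.0       # running sum of values in the window

    def add(self, timestamp, value):
        """Record one event and return the average over the window."""
        self._events.append((timestamp, value))
        self._total += value
        # Evict events that are older than the window.
        cutoff = timestamp - self.window_seconds
        while self._events[0][0] < cutoff:
            _, old_value = self._events.popleft()
            self._total -= old_value
        return self._total / len(self._events)


if __name__ == "__main__":
    window = SlidingWindowAverage(window_seconds=10)
    # Hypothetical (timestamp, engine temperature) readings.
    for ts, temp in [(0, 80), (5, 90), (12, 100)]:
        print(ts, window.add(ts, temp))
```

Because the running total is updated incrementally, each new event costs very little work, which is what lets an insight arrive in time to act on it. The sketch assumes events arrive in timestamp order.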
Another interesting trend is what I call commodity Machine Learning. What I mean by that is that Machine Learning is becoming more accessible to non-specialized software developers and analysts. Every major cloud provider has Machine Learning as a service. These services make the process of selecting, training, optimizing, and deploying Machine Learning applications an easier task, which means that it's simpler than ever to add smarts to a mobile or web application.
Finally, a trend that is particularly interesting to me because of its disruptive power is the Conversational UI. While this is not only a Big Data trend, it would not be possible without Big Data. The Conversational UI or chat interface is all about enabling you to use familiar messaging interfaces to perform tasks that we used to do with a handful of apps on our smartphones. For example, it's easier to transfer money by typing "transfer $50 from savings to checking" in the messenger app that I use multiple times a day than by navigating the banking app that I use once a week. And when I'm done with that I can easily order a pizza with a simple "order a large pepperoni pizza from My Favorite Pizza Place." What's great is that at this point in the conversation My Favorite Pizza Place could respond with "Would you like a second large pizza at half price?" And all this could be done without human intervention thanks to Big Data.
Q. Why did you decide to speak at the In-Memory Computing Summit?
Matt: As I mentioned before, in-memory computing is an important trend in Big Data that is allowing us to gain insights before it's too late to take action. This speed to insight allows us to create new data products that provide incredible value to our customers. The In-Memory Computing Summit is an event dedicated to this. One of the leading open source projects for in-memory computing is Apache Ignite, which is a high-performance, integrated, and distributed in-memory platform for computing and transacting on large-scale data sets in real time. It offers many features, and I thought it would be interesting to give a quick tutorial on streaming and complex event processing using Apache Ignite.
Q. What does ISE offer in Big Data?
Matt: There are four different services we provide in Big Data:
- First of all, we offer enterprise architecture and planning, which is important to ensure you have the right Big Data technologies positioned within your enterprise and a plan for how to implement that architecture.
- We also offer our customers full implementation and integration services. This is really for those companies that have not yet adopted Big Data and are starting from the ground up.
- If you have a Big Data implementation already and you need a new application within your ecosystem, we have trained software engineers who can assist.
- Finally, and maybe most importantly, we provide analytics, machine learning, and visualization services. We work with the customer to understand the questions they are trying to answer and the problems they are trying to solve with data, and then we build tools, algorithms, and visualizations to achieve their goals.
Q. What do you do outside of work for fun?
Matt: We have four kids, so I have a lot of fun with them. I enjoy running, but in moderation. Maybe the most interesting thing I do outside of work is that I'm a songwriter. I really enjoy the creativity and the hard work that goes into writing a song. It's a rewarding process.
Do you have a Big Data project that ISE can help you with? Contact us! Interested in attending the In-Memory Computing Summit? Register here and use code SeeMeSpeak for a discount on general admission passes.