
Wednesday, May 17, 2017

MyTechieGuy Question Of The Day - What is BIG DATA and why is it a big deal?

Welcome to yet another instalment of MTGQOTD (My Techie Guy Question Of The Day). Today's question is: what is BIG DATA?



We shall start with the basic definition of BIG DATA. I compiled a few definitions from different sources, and they pretty much mean the same thing.

What is BIG DATA?

Source 1: Google

"Big data" is extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions.

Source 2: SAS

Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis. But it’s not the amount of data that’s important. It’s what organizations do with the data that matters. Big data can be analyzed for insights that lead to better decisions and strategic business moves.

Source 3: Tech Target

Big data is an evolving term that describes any voluminous amount of structured, semi-structured and unstructured data that has the potential to be mined for information.

Source 4: Webopedia

“Big Data” is a phrase used to mean a massive volume of both structured and unstructured data that is so large and difficult to process using traditional database and software techniques. In most enterprise scenarios the volume of data is too big or it moves too fast or it exceeds current processing capacity.

Source 5: Oracle

“Big data” describes a holistic information management strategy that includes and integrates many new types of data and data management alongside traditional data.


Big data is commonly characterised by four Vs:

Volume – Big data is associated with humongous amounts of data.
Velocity – Data streams in at a fast rate and also needs to be processed at a fast rate.
Variety – Data comes in all types of formats: structured, numeric, semi-structured and unstructured data types such as text, audio, video, email, images, etc.
Value – Using analytic tools and data mining techniques, data is analysed for insights that lead to better decisions and strategic business moves, and that is where the value of the data lies.

So why is big data a big deal?

In a world where everything is or will be connected (IoT), data will be the driving force. What matters is not just how much data you gather but also what you do with it. Big data combined with powerful analytic tools and techniques will be used to find answers to real-life problems, for example:
1.    Reduce the costs of production
2.    Reduce the time spent and its associated costs
3.    Market driven product development
4.    Smart decisions based on facts

Basically, the gold of the digital age is “data”, and giants like Google, Apple, Amazon, Microsoft and Facebook are already mining it and cashing in. No wonder they are among the most valuable listed companies in the world; they collectively bagged $25bn in net profit in the first quarter of 2017.

So where does big data come from?

Big data is all around us: social media data, connected devices (IoT devices), mobile apps, publicly available government databases, operating systems, browsers, and much more. Don’t be surprised when I tell you that you willingly supply this valuable raw material called “big data”, and you give it out for free.

I will give you a few real-life examples of how you freely give out this raw material to the data miners:

1. Your Facebook or Twitter Account – You voluntarily open an account and provide your biodata and geographical information, for example where you live and where you work. This is an example of structured data that you freely give to social media sites. One person's data on its own wouldn’t be worth much, but as I write this post, Facebook has approximately 1.86 billion monthly active users, and by the time I publish this article, maybe 1 million new accounts will have been created. So now you know what I meant when I said big data comes in “Volumes”.

2. Your Facebook status update – When you post something like “am having coffee with my friends at - location X”, you have just supplied a string of unstructured data to Facebook and added your location. If you are keen, you will start seeing adverts on your Facebook page that closely relate to what you just posted. What Facebook did was take that string of information you call a “status update”, mine it and sell it to Ad buyers.

3. Browser data – A few days ago I opened my Chrome browser and started searching for flights from Jakarta to Entebbe, and before I knew it, ads for Qatar Airways were showing up on my Facebook page. So, what exactly happened in the background? Google Chrome took my search history, mined it and sold it to Qatar Airways (who in this case becomes the Ad buyer).
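
To make the difference between the structured data in example 1 and the unstructured data in example 2 concrete, here is a minimal Python sketch. It is only an illustration and not how Facebook or any real ad platform works: the profile fields, the `AD_KEYWORDS` list and the `extract_signals` function are all made up for this post.

```python
import re

# Structured data: fixed, named fields, like a sign-up form (values are made up).
profile = {"name": "Jane Doe", "lives_in": "City A", "works_at": "Company B"}

# Unstructured data: free text, like a status update.
status = "Having coffee with my friends at Cafe X"

# Hypothetical list of keywords an Ad buyer might care about.
AD_KEYWORDS = {"coffee", "flight", "pizza"}

def extract_signals(text):
    """Pull ad keywords and a guessed location out of free text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    keywords = sorted(words & AD_KEYWORDS)
    # Naive guess: whatever follows the first standalone "at" is the location.
    match = re.search(r"\bat\s+(.+)$", text)
    location = match.group(1).strip() if match else None
    return {"keywords": keywords, "location": location}

# Structured data can be read directly by field name.
print(profile["lives_in"])

# Unstructured data has to be mined before it is useful.
print(extract_signals(status))
```

The point of the sketch is that a structured field can be read directly, while a status update has to be mined first. Real systems use far more sophisticated text analysis than a keyword list.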

The daily examples are endless, and all these apps we use daily serve as channels to collect data from their users and sell it to data miners.
F.Y.I., I am not writing this to scare you or to discourage you from using these apps, but just to make you aware of what happens around us and behind us. It’s the digital age, and you have to embrace it because at the end of the day these apps are here to make our lives better.

Big data combined with Internet of Things (IoT) is today used by businesses and industries to do a lot of powerful stuff. Connected devices send data in real time and this data can be analysed and used to make smart business decisions.

So, who uses big data?

Banks – to make smart financial decisions, detect money laundering, detect fraud before it happens, forecast losses or profits before they happen and much more. They do this by collecting data from all financial transactions.

Healthcare – Big data combined with IoT is today used in health care to detect illnesses early. By wearing a small sensor, you are able to send real-time data about your heartbeat, temperature and blood pressure to the hospital's database, where it can be analysed to detect any problems before the situation worsens.

Education – This is another area where you have to deal with vast amounts of data: students' biodata, financial records, wellbeing, academic progress and social life. Schools are always looking for a smart way of tracking all this.

Governments – Managing public utilities, managing employees, controlling traffic, preventing crime, extending services, relief, security and much more.

Manufacturing – Combined with IoT devices to supply data, processing lines can now detect failures before they happen, optimize their production, schedule maintenance and much more.

Retail – Big data analytics can be used to target potential customers and also to develop customer retention programs.
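
Several of the examples above (healthcare sensors, manufacturing lines) come down to the same idea: check a stream of readings against what is considered normal and raise a flag early. Here is a rough Python sketch of that idea. The metric names, the ranges in `NORMAL_RANGES` and the readings are invented for illustration; real limits would come from domain experts, and real systems would use statistical models rather than fixed thresholds.

```python
# Hypothetical "normal" ranges; real limits would come from domain experts.
NORMAL_RANGES = {
    "heart_rate_bpm": (50, 110),
    "temperature_c": (36.0, 37.8),
}

def find_alerts(readings):
    """Return the readings whose value falls outside its normal range."""
    alerts = []
    for reading in readings:
        low, high = NORMAL_RANGES[reading["metric"]]
        if not low <= reading["value"] <= high:
            alerts.append(reading)
    return alerts

# Made-up readings, as a wearable sensor might stream them.
stream = [
    {"metric": "heart_rate_bpm", "value": 72},
    {"metric": "temperature_c", "value": 38.6},
    {"metric": "heart_rate_bpm", "value": 131},
]

for alert in find_alerts(stream):
    print("ALERT:", alert["metric"], "=", alert["value"])
```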

Big data tools and where to go to get started?

Apache Hadoop - The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
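
The best-known of those "simple programming models" is MapReduce. The sketch below is not Hadoop code and does not use any Hadoop API. It is a plain Python imitation of the map, shuffle and reduce steps running on a single machine, using the classic word count example. The function names are my own. On a real cluster, Hadoop would run the map and reduce steps in parallel on many machines.

```python
from collections import defaultdict

def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle step: group all emitted values by key (the word)."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped

def reduce_phase(word, counts):
    """Reduce step: add up the counts for one word."""
    return word, sum(counts)

def word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())

lines = ["big data is big", "data is everywhere"]
print(word_count(lines))
```

Because each line can be mapped on its own and each word can be reduced on its own, the work splits naturally across many computers, and that is what lets this model scale.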

And here are a few videos I handpicked from YouTube to give you a brief overview of Hadoop and the traditional SQL approach vs big data (NoSQL):




