Welcome to yet another edition of MTGQOTD (My Techie Guy Question Of The Day). Today's question: what is BIG DATA?
We shall start with the basic definition of BIG DATA. I compiled a list of definitions from different sources, but they pretty much mean the same thing.
What is BIG DATA?
Source 1: Google
"Big data" is extremely large data sets that may be
analyzed computationally to reveal patterns, trends, and associations,
especially relating to human behavior and interactions.
Source 2: SAS
“Big data” is a term that describes the large volume of data – both structured and
unstructured – that inundates a business on a day-to-day basis. But it’s not
the amount of data that’s important. It’s what organizations do with the data
that matters. Big data can be analyzed for insights that lead to better
decisions and strategic business moves.
Source 3: TechTarget
“Big data” is an evolving term that describes any voluminous amount of structured, semi-structured and unstructured data that has the potential to be mined for information.
Source 4: Webopedia
“Big Data” is a phrase used to mean a massive volume
of both structured and unstructured data that is so large that it is difficult to process using
traditional database and software techniques. In most enterprise
scenarios the volume of data is too big or it moves too fast or it exceeds current
processing capacity.
Source 5: Oracle
“Big data” describes a holistic information management strategy that includes and
integrates many new types of data and data management alongside traditional
data.
Big data is well defined by the 4Vs:
Volume – Big data is associated with humongous amounts of data.
Velocity – Data streams in at a fast rate and also needs to be
processed at a fast rate.
Variety – Data comes in all types of formats: structured,
numeric, semi-structured and unstructured data types such as text, audio,
video, email, images etc.
Value – Using analytic tools and data mining techniques, data
is analysed for insights that lead to better decisions and strategic business
moves, and that’s the intrinsic value of data.
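To make the "Variety" point a little more concrete, here is a minimal Python sketch of what structured, semi-structured and unstructured records can look like. The sample records and the `describe` helper are my own invention for illustration; real systems classify and store data in far more sophisticated ways.

```python
import json

# Hypothetical sample records, invented for illustration only.
structured_row = ("Alice", 34, "Kampala")  # fixed columns: name, age, city
semi_structured = '{"user": "alice", "tags": ["coffee", "travel"]}'  # JSON with flexible fields
unstructured = "Having coffee with my friends at location X"  # free text

def describe(record):
    """Return a rough label for the kind of data a record holds."""
    if isinstance(record, tuple):
        return "structured"
    try:
        parsed = json.loads(record)
        if isinstance(parsed, (dict, list)):
            return "semi-structured"
    except (TypeError, ValueError):
        # Not valid JSON, so treat it as free-form content.
        pass
    return "unstructured"

for record in (structured_row, semi_structured, unstructured):
    print(describe(record), "->", record)
```

The labels here are only a rule of thumb: a tuple stands in for a database row, a JSON string for semi-structured data, and anything else for unstructured content.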
So why is big data a big deal?
In a world where everything is or will be connected (IoT), data will be the driving
force and this will not depend on just how much data you gather but also on
what you do with it. Big data combined with powerful analytic tools and
techniques will be used to find answers to real life problems, for example:
1. Reducing the costs of production
2. Reducing time and its associated costs
3. Market-driven product development
4. Smart decisions based on facts
Basically, the gold of the digital age is “data”, and giants like Google, Apple, Amazon, Microsoft and Facebook are already
mining it and cashing in. No wonder they are the most valuable listed
companies in the world: they collectively bagged $25bn in net profit in the
first quarter of 2017.
So where does big data come from?
“Big data” is all around us: social media data, connected devices (IoT devices), mobile
apps, publicly available government databases, operating systems, browsers, and
much more. Don’t be surprised when I tell you that you willingly supply this
valuable raw material called “big data” and you give it out for free.
I will give you a few real-life examples of how you freely give out this raw material
to the data miners:
1. Your Facebook or Twitter account – You voluntarily open an account and
provide your biodata and geographical information, for example where you live
and where you work. This is an example of structured data that you freely give
to social media sites. One person's data means little on its own, but as I write this post Facebook has approximately 1.86
billion monthly active users, and by the time I publish this article maybe 1
million new accounts will have been created. So now you know what I meant by “big
data” comes in “Volumes”.
2. Your Facebook status update, e.g. “am having coffee with my friends
at location X” – You have just supplied a string of unstructured data to
Facebook and added your location, and if you are very keen you will start
seeing adverts on your Facebook page that closely relate to what you just posted.
What Facebook did is take that string of information you call a “status update”,
mine it and sell it to Ad buyers.
3. Browser data – A few days ago I opened my Chrome browser
and started searching for flights from Jakarta to Entebbe, and before I
knew it, ads for Qatar Airways were showing up on my Facebook page. So, what exactly happened in the
background? Google Chrome took my search history, mined it and sold it to Qatar
Airways (who in this case becomes the Ad buyer).
The daily examples are endless, and all these apps we use daily serve as
channels to collect data from their users and sell it to data miners.
F.Y.I., I am not writing this to scare you or to discourage you from using these apps,
but just to make you aware of what happens around us and behind us. It’s the
digital age, and you have to embrace it because at the end of the day these
apps are here to make our lives better.
Big data combined with the Internet of Things (IoT) is today used by businesses and
industries to do a lot of powerful stuff. Connected devices send data in real
time and this data can be analysed and used to make smart business decisions.
So, who uses big data?
Banks – to make smart financial decisions, detect money laundering,
detect fraud before it happens, forecast losses or profits before they happen
and much more. They do this by collecting data from all financial transactions.
Healthcare – Big data combined with IoT is today used by healthcare providers
to detect illnesses before they happen. By wearing a small sensor, you are able
to send real-time data about your heartbeat, temperature and blood pressure to the
hospital database, which doctors can analyse to detect any anomalies before the
situation worsens.
Education – This is another area where you have to deal with vast
amounts of data: students' biodata, financial records, wellbeing, academic
progress, social life. Schools are always looking for a smart way of tracking
all this.
Governments – Managing public utilities, managing employees,
controlling traffic, preventing crime, extending services, relief, security and
much more.
Manufacturing – Combined with IoT devices to supply data, processing
lines can now detect failures before they happen, optimize their production,
schedule maintenance and much more.
Retail – Big data analytics can be used to target potential
customers and also to develop customer retention programs.
Big data tools and where to go to get started?
Apache Hadoop - The Apache Hadoop software library is a framework that allows for the distributed
processing of large data sets across clusters of computers using simple
programming models. It is designed to scale up from single servers to thousands
of machines, each offering local computation and storage.
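To give you a feel for what a "simple programming model" means here, below is a minimal sketch of the map/reduce idea behind Hadoop's MapReduce, written in plain Python. It does not use Hadoop at all and runs on a single machine; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample text are my own, chosen only for illustration.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework would between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all the values collected for one key into a single count."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

# Hypothetical input; on a real cluster each machine would map its own slice of the data.
sample = ["big data is big", "data is everywhere"]
counts = word_count(sample)
print(counts)
```

The point of the model is that the map and reduce steps work on independent pieces of data, so a framework can spread them across many machines without you rewriting the logic.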
And here are a few videos I handpicked from YouTube to
give you a brief overview of Hadoop and the traditional SQL approach vs Big
data (NoSQL).