IU X Informatics Unit 5 Lesson 4 Features of Data Deluge II


This right here is, again,
a slide from Bina, who is pointing out some features that
we need to integrate together for data-intensive applications. It's actually a mix
of data sources and technologies here, so we know that
social networking sites like Facebook are an incredibly
rich source of data. In fact, Facebook announced
a new Graph Search algorithm for adding value to
their website. The next topic here is mashups, so
mashups are very important. They're sometimes called workflows,
and they basically are a way of taking different services, where
the member services are transformations or filters that take data
from one form to another form. So you have multiple services,
you join them together, and you get a new service. There is a very famous
website, ProgrammableWeb.com, which has lots and lots of mashups. So
using services together to get new results is very important and
intrinsic to the big data revolution. The interface to all of
this is portals and, interestingly, some important
technologies, like wikis. Wikis allow you to collaborate
on generating data, and so they're one source of big data,
or one way of documenting
collaborative work on big data. And we obviously have on the
internet many media sharing sites, like YouTube, which
have enormous amounts of data associated with them. They are part of the big
data revolution. Online gaming is a good
example of the explosion. Biology and space science obviously are fields that have a lot of data associated with them. Actually, space is an example
where, if it's really space, the data comes from Mars or the moon. Especially from Mars, the communication
bandwidth is very small. You will not find petabytes
of data coming from Mars. The Jet Propulsion Lab,
which runs those explorations, puts a huge amount of effort
into the processing of this data because it's so difficult to get
and so hard to transport from Mars. What you see is a case
where better processing is very important because you
cannot add more data. So here's my list of possible
types of data deluge applications, which effectively we've
gone through by example. We started off with traditional
business transaction data, like stock market data
and credit card data. We have interaction data from
social networking sites, like LinkedIn and Facebook. We have information retrieval, where
there are also sophisticated things such as language translation. I have noted that language
translation is a good example of the big data revolution. Namely, rather than using
sophisticated language technologies, what Google found
was that in their collection of data they had many,
many sources where they had the same
data in multiple languages. So they could do a sort of
lookup form of translation, where they learned from
these existing translations how to translate new data. So that again is a good example
of the impact of the data deluge. Because you collect together in one
place, in Google's archives and obviously other people's archives,
multiple documents, each of
which existed in multiple languages, a different approach to language
translation was possible. We pointed out recommender systems. We'll actually discuss those later
on in the class as a use case. There was the example of the Walmart scientist who actually taught
recommender systems in a class, and how the use of more data was
better than a more sophisticated algorithm. So recommender systems
are important. They span from Netflix to
CareerBuilder and Monster.com, where the recommender systems are
kind of matching users to movies, employers to employees,
or friends to people. LinkedIn matches
contacts to you and suggests connections, and Facebook of course does the same. They suggest friends. So friends matching is an example
of a recommender system. And of course ecommerce sites like
Amazon make huge use of recommender systems to try to recommend
to you what you should buy. We have marketing information. This is optimizing pricing or
optimizing store placement that is changing the operation
of supermarkets. We discussed the internet of things,
so that’s another important example. Pervasive sensors,
intelligent everything. Then we have the large and possibly
smaller scientific instruments, ranging from satellites and telescopes
to accelerators and gene sequencers. And there is an area
which is not necessarily distinct, because it uses
some of the same techniques and systems as the previous
commercial and academic examples. The military obviously has
important uses of sensors and satellites in the data deluge.
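
To make the mashup idea from earlier in this lesson a little more concrete, here is a minimal Python sketch of joining services together to get a new service. It is only an illustration under stated assumptions: the service names (only_country, add_fahrenheit), the mashup helper, and the sample readings are all hypothetical and do not come from the lecture or from any real mashup site.

```python
from typing import Callable, Dict, List

# A "service" here is any function that takes a list of records and
# returns a new list of records (a transformation or a filter).
Record = Dict[str, object]
Service = Callable[[List[Record]], List[Record]]


def only_country(country: str) -> Service:
    """Hypothetical filter service: keep records for one country."""
    def service(records: List[Record]) -> List[Record]:
        return [r for r in records if r["country"] == country]
    return service


def add_fahrenheit(records: List[Record]) -> List[Record]:
    """Hypothetical transformation service: add a Fahrenheit field."""
    return [{**r, "temp_f": r["temp_c"] * 9 / 5 + 32} for r in records]


def mashup(*services: Service) -> Service:
    """Join several services, applied left to right, into one new service."""
    def combined(records: List[Record]) -> List[Record]:
        for service in services:
            records = service(records)
        return records
    return combined


# Made-up sample data, standing in for the output of some data source.
readings = [
    {"city": "Bloomington", "country": "US", "temp_c": 20.0},
    {"city": "Paris", "country": "FR", "temp_c": 15.0},
]

# The new service is just the two existing services joined together.
us_weather = mashup(only_country("US"), add_fahrenheit)
result = us_weather(readings)
print(result)
```

The point of the sketch is only that the combined service has the same shape as its members, so it can itself be joined into a larger mashup.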
