Changes

Twitter (edit)

Revision as of 13:18, 2 July 2020


This article is part of the [[Advanced User's Guide]]. It describes how to use BaseX to process and store the live data stream of Twitter, and presents some statistics about the Twitter data and the performance of BaseX.

As [http://twitter.com Twitter] attracts more and more users (over 140 million active users in 2012) and generates large amounts of data (over 340 million short messages ('tweets') daily), it has become an exciting data source for all kinds of analytics. Twitter provides the developer community with a set of [https://developer.twitter.com/start APIs] for retrieving data about its users and their communication, including the [https://dev.twitter.com/docs/streaming-apis Streaming API] for data-intensive applications, the [https://dev.twitter.com/docs/using-search Search API] for querying and filtering the messaging content, and the [https://dev.twitter.com/docs/api REST API] for accessing the core primitives of the Twitter platform.

=BaseX as Twitter Storage=

To retrieve the Twitter stream, we connect to the Streaming API endpoint of Twitter and receive a never-ending stream of tweets. As Twitter delivers the tweets as [https://www.json.org/ JSON] objects, the data is converted into XML fragments. For this purpose, the parse function of the [[JSON Module|XQuery JSON Module]] is used. In the examples section both versions are shown ([[#Example Tweet (JSON)|tweet as JSON]] and [[#Example Tweet (XML)|tweet as XML]]). For storing the tweets including the meta-data, we use the standard ''insert'' function of [[Updates|XQuery Update]].
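The conversion and storage steps described above can be sketched as follows. This is a minimal illustration, not the code used for the measurements in this article: the sample tweet is hypothetical and heavily shortened, and the <code>tweets</code> container element is an assumed name. To keep the sketch self-contained, it inserts into an in-memory copy via a transform expression; against a real database, the insert target would be a node of that database instead.

```xquery
(: Hypothetical, shortened tweet; real tweets carry many more fields. :)
let $json := '{ "id": 1, "text": "Hello BaseX", "user": { "screen_name": "example" } }'
(: json:parse converts the JSON string to an XML representation. :)
let $tweet := json:parse($json)
return
  (: Transform expression: the insert is applied to a copy of the
     assumed <tweets/> container, which is then returned. :)
  copy $store := <tweets/>
  modify insert node $tweet into $store
  return $store
```

The query returns the <code>tweets</code> element with the converted tweet as its child. Because the update runs on a copy, nothing is written to disk, which makes the sketch safe to try out in the BaseX GUI or on the command line.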

=Twitter’s Streaming Data=

CG

