Social Web: Where are the Semantics?

Tutorial within the ESWC 2014

Code Snippets: Data acquisition

Task 1.1: Twitter data acquisition

There are several APIs for accessing Twitter data from different programming languages. You can post a tweet, get a timeline (the posts of a user), send and receive direct messages, search for tweets, or open a stream. We shall look at one of the simplest and most elegant of them, Twitter4J.

To access the API, you must first register your application: https://twitter.com/oauth_clients/new

Snippet #1: Make a search in Twitter

Download the required twitter4j-core-4.0.1.jar library and add it to your project. Secrets: q,H. The following snippet searches for tweets (a mix of popular and recent ones) containing the hashtag "#eswc2014".

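        // Requires: import twitter4j.*; import twitter4j.conf.ConfigurationBuilder;
        // OAuth credentials can come from twitter4j.properties or be set on the builder (e.g. setOAuthConsumerKey(...))
        ConfigurationBuilder config = new ConfigurationBuilder();
        Twitter twitter = new TwitterFactory(config.build()).getInstance();
        Query query = new Query("#eswc2014");
        QueryResult result = twitter.search(query); // may throw TwitterException
        // Print the author and text of each tweet, separated by a tab
        for (Status status : result.getTweets()) {
            System.out.println("@" + status.getUser().getScreenName() + "\t" + status.getText());
        }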
The code above should produce output like this:
@m1ci	If my paper get accepted @iswc2014 I will run a marathon! Anybody wants to join? :-) #iswc #iswc2014 #marathon4iswc 
@Stavros_IT	RT @ieeepervasive: RT @ubicomp: Apply by June 1 if you want to be a student volunteer at #UbiComp or #ISWC. http://t.co/uCX03Sb5T8 http://t…
@ieeepervasive	RT @ubicomp: Apply by June 1 if you want to be a student volunteer at #UbiComp or #ISWC. http://t.co/uCX03Sb5T8 http://t.co/IUs3VzU6Hz
@LloydRutledge	RT @pascal_molli: 3rd Workshop SWCS@ISWC2014 web site is available at http://t.co/TMZ52A6QuH #ISWC #ISWC2014 #LinkedData
@DataScienceLINA	RT @pascal_molli: 3rd Workshop SWCS@ISWC2014 web site is available at http://t.co/TMZ52A6QuH #ISWC #ISWC2014 #LinkedData
@pascal_molli	3rd Workshop SWCS@ISWC2014 web site is available at http://t.co/TMZ52A6QuH #ISWC #ISWC2014 #LinkedData
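
Twitter4J hides the underlying HTTP request. As a rough illustration of what a search amounts to, the sketch below builds the request URL by hand. The endpoint and the result_type parameter are assumed to be those of the REST API v1.1, and the class and method names are made up for this example. The point to notice is that the "#" of a hashtag must be percent-encoded. A real request would additionally need OAuth headers, which are not shown here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class TwitterSearchUrlSketch {
    // Assumption: the search resource of the REST API v1.1.
    static final String SEARCH_ENDPOINT = "https://api.twitter.com/1.1/search/tweets.json";

    /** Builds a search URL; resultType is expected to be "mixed", "recent" or "popular". */
    static String buildSearchUrl(String query, String resultType) {
        try {
            // '#' would otherwise start a URL fragment, so the query is percent-encoded.
            String encoded = URLEncoder.encode(query, "UTF-8");
            return SEARCH_ENDPOINT + "?q=" + encoded + "&result_type=" + resultType;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a JVM; rethrow unchecked just in case.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildSearchUrl("#eswc2014", "mixed"));
    }
}
```

In the resulting URL the hashtag appears as %23eswc2014.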

Task 1.2: Facebook data acquisition

In this task we will acquire some data from Facebook open groups and Facebook pages by using the Facebook API via restfb, a Java client. To access the Facebook API, you must create an application and obtain your APP_ID and APP_SECRET: https://developers.facebook.com/

Snippet #1.2: Collect data from the ESWC and ISWC conference Facebook pages

We have prepared some easy code for you to play with :) The Facebook data collector is available at FbDataCollectorGist. This Gist contains three main files:
  • FacebookDataCollector.java: contains the code that you need to download data from Facebook open groups or Facebook pages
  • pom.xml: contains the dependencies. If you prefer not to use a Maven project, just go to restfb and download the corresponding library
  • fbCollector.properties: this is the properties file that you need to set up, including
    • appId and appSecret. These are your Facebook credentials. To get them you need to go to applications
    • FbGroups and FbPages. These are the lists of groups and pages for which you want to download information. If you have more than one group or page, please provide comma-separated values
    • maxPosts. This is the maximum number of [initial] posts that you want to download for each group/page. Note that for each initial post, all of its corresponding comments will also be included in the download
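
To make the properties file more concrete, here is a minimal sketch of how such a file could be read with the standard java.util.Properties class. The key names are the ones listed above. The values are placeholders, and the parsing logic (class and method names included) is our own assumption, not necessarily what the Gist does.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class CollectorConfigSketch {
    /** Splits a comma-separated value into trimmed, non-empty entries. */
    static List<String> splitList(String value) {
        List<String> items = new ArrayList<String>();
        if (value == null) {
            return items;
        }
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                items.add(trimmed);
            }
        }
        return items;
    }

    /** Parses properties text; in practice it would be read from fbCollector.properties. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values only: not real credentials, groups or pages.
        String text = "appId=YOUR_APP_ID\nappSecret=YOUR_APP_SECRET\n"
                + "FbGroups=group1, group2\nFbPages=page1\nmaxPosts=50\n";
        Properties props = parse(text);
        System.out.println(splitList(props.getProperty("FbGroups")));
        System.out.println(Integer.parseInt(props.getProperty("maxPosts").trim()));
    }
}
```

Trimming each entry means that "group1, group2" and "group1,group2" are treated the same way.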
To help you understand the code a little: connecting to Facebook is very easy. You just need to obtain an access token using your Facebook credentials and use this token to create a Facebook client. See below :)

    FacebookClient.AccessToken accessToken = new DefaultFacebookClient().obtainAppAccessToken(appId, appSecret);
    String myAccessToken = accessToken.getAccessToken();
    this.facebookClient = new DefaultFacebookClient(myAccessToken);
Once created, the Facebook client is used to fetch the content of the group:

    Connection<Post> pageFeed = facebookClient.fetchConnection(fbGroupId + "/feed", Post.class);
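
A feed comes back in pages, and a restfb Connection can be iterated page by page, each page being a list of results. The sketch below shows one way to honour a maxPosts limit while walking through such pages. To stay self-contained it uses plain lists of strings as a stand-in for the pages of posts; the class and method names are hypothetical and not taken from the Gist.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FeedPagingSketch {
    /** Collects at most maxPosts items, stopping as soon as the limit is reached. */
    static <T> List<T> takeUpTo(Iterable<List<T>> pages, int maxPosts) {
        List<T> collected = new ArrayList<T>();
        if (maxPosts <= 0) {
            return collected;
        }
        for (List<T> page : pages) {
            for (T item : page) {
                collected.add(item);
                // Return right away so that no further page is requested.
                if (collected.size() >= maxPosts) {
                    return collected;
                }
            }
        }
        return collected;
    }

    public static void main(String[] args) {
        // Stand-in for a paged feed: two pages with two posts each.
        List<List<String>> pages = new ArrayList<List<String>>();
        pages.add(Arrays.asList("post 1", "post 2"));
        pages.add(Arrays.asList("post 3", "post 4"));
        System.out.println(takeUpTo(pages, 3));
    }
}
```

Checking the limit right after each addition matters when the pages are fetched lazily over the network, since it avoids requesting a page that will not be used.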
The FacebookDataCollector class produces four files as output.
  • posts.csv: contains all the posts and comments that have been downloaded, including their text, time, the user who generated them and the group or page they come from
  • replies.csv: contains the reply chain, i.e., information about which posts have been generated as comments to other posts
  • groups.csv: contains information about the groups from which information has been downloaded
  • users.csv: contains information about the users that have generated the collected posts and comments
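
Post texts often contain commas, quotes and line breaks, so fields need escaping when written to CSV. The sketch below shows the usual convention (quote the field and double any embedded quotes). The column layout in the example is purely illustrative, the class is hypothetical, and the actual collector may format its files differently.

```java
class CsvRowSketch {
    /** Quotes a field if it contains a comma, quote or line break, doubling embedded quotes. */
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    /** Joins already-escaped fields into one CSV row (without a trailing line break). */
    static String toRow(String... fields) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                row.append(',');
            }
            row.append(escape(fields[i]));
        }
        return row.toString();
    }

    public static void main(String[] args) {
        // Illustrative columns: post id, user id, text.
        System.out.println(toRow("post-1", "user-7", "Great talk, see \"slides\" online"));
    }
}
```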

Task 1.3: Flickr data acquisition

We are not using this in the tutorial. Only for your reference!

Flickr also provides a well-documented API. One way of acquiring photos (and their metadata) is provided in the ManagerFlickr Java class. Again, you'll first have to register your application with Flickr here, as Flickr also uses OAuth (the provided snippet will *not* work with the default key).

You could use one of the unofficial libraries, for example FlickrJ. However, the provided class does not require any library and shows how to invoke the Flickr HTTP REST API directly.

A search method is provided; the authenticate method must be called before any other. Apart from these two, the class contains only one private method.

    /**
     * This method authenticates in the Flickr API. You'll need an API Key and a Secret Key
     * Find it here: https://www.flickr.com/services/apps/create/apply
     * @return true on a successful authentication
     */
    public static boolean authenticate(String _apiKey, String _secretKey);

    /**
     * Shows photo metadata in a given area, with a certain precision
     * This call requires a previous authorisation
     * The default values 0, 0, 0 will just show some photos from Anissaras
     * @param lat Latitude
     * @param lon Longitude
     * @param precision Precision. 11=city-level, 15=street level, etc.
     * @return Nothing. Metadata info (tags, location, author, etc.) of a set of photos 
     * will be shown in stdout. Tune the function to use the results otherwise!
     */
    public static void search(double lat, double lon, int precision);
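
To give an idea of what invoking the REST API directly looks like, the sketch below builds a flickr.photos.search request URL by hand. This is our own illustration and not the code of ManagerFlickr: the class name is hypothetical, the API key is a placeholder, and the coordinates are arbitrary. In the Flickr API the precision is passed in a parameter called accuracy. The sketch assumes an unsigned request for public photos; signed calls need extra parameters that are not shown.

```java
import java.util.Locale;

class FlickrSearchUrlSketch {
    static final String REST_ENDPOINT = "https://api.flickr.com/services/rest/";

    /** Builds a flickr.photos.search URL for photos around the given point. */
    static String buildSearchUrl(String apiKey, double lat, double lon, int accuracy) {
        // Locale.ROOT keeps '.' as the decimal separator whatever the JVM's default locale is.
        return String.format(Locale.ROOT,
                "%s?method=flickr.photos.search&api_key=%s&lat=%.4f&lon=%.4f"
                        + "&accuracy=%d&format=json&nojsoncallback=1",
                REST_ENDPOINT, apiKey, lat, lon, accuracy);
    }

    public static void main(String[] args) {
        // Placeholder key and illustrative coordinates; 11 is the city-level precision.
        System.out.println(buildSearchUrl("YOUR_API_KEY", 35.3, 25.4, 11));
    }
}
```

The URL can then be fetched with any HTTP client, and the JSON response parsed for tags, location and author.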

License: The contents of this page are licensed under a CC-BY license. Disclaimer: We provide this code without any warranty. Use it at your own risk.

Photo used under Creative Commons CC-BY license from youasamachine