Data Format


The following documentation describes Gnip’s Activity Streams format for Twitter data. For detailed documentation of Twitter’s original data format, please see their documentation, or skip to the Sample Payloads section to view a side-by-side comparison of the Original and Activity Streams formats.

Tweet Activities 

Tweets, Retweets, Quote Tweets, and Deleted Tweets

The following table describes the root-level data structures for Tweet activities (Tweet, Retweet, Quote Tweet, and Deleted Tweet). For fields with multiple levels of sub-fields, the sub-field details are listed beneath the field description.

id

A unique IRI for the tweet. In more detail, "tag" is the scheme, "search.twitter.com" represents the domain for the scheme, and 2005 is when the scheme was derived.

When storing Tweets, this should be used as the unique identifier or primary key.
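As a minimal sketch of working with this identifier, the following Python function extracts the numeric Tweet ID from the IRI. The regular expression is inferred from the sample value for this field and is an assumption, not a published grammar; `tweet_numeric_id` is a hypothetical helper name.

```python
import re

# Pattern inferred from the documented sample value; treat it as an assumption.
TWEET_IRI = re.compile(r"^tag:search\.twitter\.com,2005:(\d+)$")


def tweet_numeric_id(activity_id: str) -> str:
    """Return the numeric Tweet ID, kept as a string, from an activity "id" IRI."""
    match = TWEET_IRI.match(activity_id)
    if match is None:
        raise ValueError(f"unexpected activity id: {activity_id!r}")
    return match.group(1)


print(tweet_numeric_id("tag:search.twitter.com,2005:347769243409977344"))
```

The ID is kept as a string so that it can be used directly as a key without any risk of numeric precision loss.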

"id": "tag:search.twitter.com,2005:347769243409977344"
                            

actor

An object representing the Twitter user who posted the Tweet. It contains all metadata relevant to that user.

Property Description
objectType "person"

See here for more detailed information.

"objectType": "person"
                                    			
id IRI for the twitter user

"id": "id:twitter.com:277184168"
                                    			
link permalink

"link": "http:\/\/www.twitter.com\/KidCodo"
                                    			
displayName the user's name

"displayName": "Zach Codo"
                                    			
image the url of the user's icon.

"image": "https:\/\/si0.twimg.com\/profile_images\/3664410292\/1d75c213a572873bf6797c5591475da5_normal.jpeg"
                                    			
summary the user's bio

"summary": null
                                    			
postedTime the user account's creation date.

"postedTime": "2011-04-04T21:31:20.000Z"
                                    			
links[0].href the user's website.

"links":
[
   {
      "href": null,
      "rel": "me"
   }
]
                                    			
location The user provided location. May be a Twitter Place, with a displayName and objectType, or a simple String.

"location":
{
  "objectType": "place",
  "displayName": "Naperville, IL"
}
                                    			
utcOffset The user's timezone offset from UTC, in seconds.

"utcOffset": "-21600"
                                    			
preferredUsername The user's screen name

"preferredUsername": "KidCodo"
                                    			
languages[0] The user's default language

"languages":
[
  "en"
]
                                    			
twitterTimeZone The user's timezone's name.

"twitterTimeZone": "Central Time (US & Canada)"
                                    			
friendsCount Number of people the user follows.

"friendsCount": 64
                                    			
followersCount Number of followers the user has.

"followersCount": 207
                                    			
listedCount Number of lists the user is in

"listedCount": 1
                                    			
statusesCount Number of tweets the user has tweeted

"statusesCount": 11207
                                    			
verified A true/false condition indicating whether Twitter has classified the posting user's account as "verified"

"verified": false
                                    			
 "actor":
 {
    "objectType": "person",
    "id": "id:twitter.com:277184168",
    "link": "http:\/\/www.twitter.com\/KidCodo",
    "displayName": "Zach Codo",
    "postedTime": "2011-04-04T21:31:20.000Z",
    "image": "https:\/\/si0.twimg.com\/profile_images\/3664410292\/1d75c213a572873bf6797c5591475da5_normal.jpeg",
    "summary": null,
    "links":
    [
       {
          "href": null,
          "rel": "me"
       }
    ],
    "friendsCount": 64,
    "followersCount": 207,
    "listedCount": 1,
    "statusesCount": 11207,
    "twitterTimeZone": "Central Time (US & Canada)",
    "verified": false,
    "utcOffset": "-21600",
    "preferredUsername": "KidCodo",
    "languages":
    [
       "en"
    ],
    "location":
    {
       "objectType": "place",
       "displayName": "Naperville, IL"
    },
    "favoritesCount": 1123
 }
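Two actor sub-fields need care when read programmatically: location may be either an object or a plain string, and utcOffset is delivered as a string of seconds in the sample above. The sketch below shows one way to normalize both; the helper names are hypothetical.

```python
from typing import Any, Dict, Optional


def actor_location_name(actor: Dict[str, Any]) -> Optional[str]:
    """Return the user's location text, whether it arrives as a place object or a string."""
    location = actor.get("location")
    if isinstance(location, dict):
        return location.get("displayName")
    if isinstance(location, str):
        return location
    return None


def actor_utc_offset_seconds(actor: Dict[str, Any]) -> Optional[int]:
    """Convert the utcOffset string (seconds from UTC) to an int, or None when absent."""
    raw = actor.get("utcOffset")
    return int(raw) if raw is not None else None


actor = {
    "utcOffset": "-21600",
    "location": {"objectType": "place", "displayName": "Naperville, IL"},
}
print(actor_location_name(actor), actor_utc_offset_seconds(actor))
```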
                        

verb

The type of action being taken by the user.

Tweets, "post"
Retweets, "share"
Deleted Tweets, "delete"

The verb is the proper way to distinguish between a Tweet and a true Retweet. However, this only applies to true retweets, and not modified or quoted Tweets, which don't use Twitter's Retweet functionality. For a description of AS verbs click here.

For Deletes, note that only a limited number of fields will be included, as shown in the sample payload below.
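The verb values listed above can be used to route activities. The following is a minimal sketch; the kind labels ("tweet", "retweet", "deleted_tweet", "other") are illustrative names chosen for this example, not part of the format.

```python
from typing import Any, Dict

# Verb values come from the list above; the labels on the right are illustrative.
VERB_KINDS = {"post": "tweet", "share": "retweet", "delete": "deleted_tweet"}


def activity_kind(activity: Dict[str, Any]) -> str:
    """Classify an activity by its verb, returning "other" for unlisted verbs."""
    return VERB_KINDS.get(activity.get("verb"), "other")


print(activity_kind({"verb": "share"}))
```

Because modified or quoted Tweets do not use the Retweet functionality, they arrive with the "post" verb and are classified as "tweet" here.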

 "verb": "post"
                            

generator

An object representing the utility used to post the Tweet. This will contain the name ("displayName") and a link ("link") for the source application generating the Tweet.

"generator":
{
   "displayName": "Twitter for iPhone",
   "link": "http:\/\/twitter.com\/download\/iphone"
}
                            

provider

A JSON object representing the provider of the activity. This will contain an objectType ("service"), the name of the provider ("displayName"), and a link to the provider's website ("link").

"provider":
{
   "objectType": "service",
   "displayName": "Twitter",
   "link": "http:\/\/www.twitter.com"
}
                            

inReplyTo

A JSON object referring to the Tweet being replied to, if applicable. Contains a link to the Tweet.

"inReplyTo":
{
   "link": "http:\/\/twitter.com\/GOP\/statuses\/349573991561838593"
}
                            

location

A JSON object representing the Twitter "Place" where the tweet was created. This is an object passed through from the Twitter platform.

Property Description
objectType "place"

See here for more detailed information.

"objectType": "place"
                                        
displayName The full name of the place.

"displayName": "Alsip, IL"
                                        
link A link to the full Twitter JSON representation of the place.

"link": "http:\/\/api.twitter.com\/1\/geo\/id\/9fdc3d1edf51a0a0.json"
                                        
geo The GeoJSON bounding box provided by Twitter.

"geo":
{
   "type": "Polygon",
   "coordinates":
   [
      [
         [
            -87.778572,
            41.651233
         ],
         [
            -87.778572,
            41.690948
         ],
         [
            -87.694909,
            41.690948
         ],
         [
            -87.694909,
            41.651233
         ]
      ]
   ]
}
                                        
streetAddress The street address of the place, if available.


                                        
name place.name from Twitter's JSON format

"name": "Alsip"
                                        
"location": 
    {
    "objectType": "place",
    "displayName": "Boulder, CO",
    "name": "Boulder",
    "country_code": "United States",
    "twitter_country_code": "US",
    "link": "https://api.twitter.com/1.1/geo/id/fd70c22040963ac7.json",
    "geo": {
        "type": "Polygon",
        "coordinates":
        [
            [
                [
                    -105.3017759,
                    39.953552
                ],
                [
                    -105.3017759,
                    40.094411
                ],
                [
                    -105.183597,
                    40.094411
                ],
                [
                    -105.183597,
                    39.953552
                ]
            ]
        ]
    },
    "twitter_place_type": "city"
},
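As a sketch of how the bounding box might be consumed, the function below reduces the polygon to its west, south, east, and north extents. It assumes each coordinate pair is ordered longitude then latitude, which is what the sample values suggest but is not stated in this document; `bounding_box` is a hypothetical helper name.

```python
from typing import Any, Dict, Tuple


def bounding_box(location: Dict[str, Any]) -> Tuple[float, float, float, float]:
    """Return (west, south, east, north) from a location's "geo" polygon.

    Assumes [longitude, latitude] ordering within each pair.
    """
    ring = location["geo"]["coordinates"][0]  # the first (outer) ring of the polygon
    longitudes = [point[0] for point in ring]
    latitudes = [point[1] for point in ring]
    return (min(longitudes), min(latitudes), max(longitudes), max(latitudes))


boulder = {
    "geo": {
        "type": "Polygon",
        "coordinates": [[
            [-105.3017759, 39.953552],
            [-105.3017759, 40.094411],
            [-105.183597, 40.094411],
            [-105.183597, 39.953552],
        ]],
    }
}
print(bounding_box(boulder))
```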
                        

geo

Point location where the Tweet was created.

"geo":
{
   "type": "Point",
   "coordinates":
   [
      41.68291626,
      -87.77269682
   ]
}
			

twitter_entities

The entities object from Twitter's data format. It contains lists of urls, mentions and hashtags.

Note that in Retweets, Twitter may truncate the values of entities that it extracts at the root level. So, for Retweets, your app should look at object.twitter_entities to ensure that you are using non-truncated values.

Property Description
hashtags Hashtags extracted from the text of the Tweet.

"hashtags":
   [
        {
          "text": "snow",
          "indices": [
            66,
            71
          ]
        },
        {
          "text": "skiing",
          "indices": [
            73,
            80
          ]
        }
   ]
                                            
symbols Symbols (such as cashtags) extracted from the text of the Tweet.

"symbols":
   [
        {
            "text": "THIS",
            "indices": [
                82,
                87
            ]
        },
        {
             "text": "THAT",
              "indices": [
                 82,
                 87
              ]
        }

   ]
                                           
urls URLs included in the text of the Tweet.

"urls":
   [

       {
         "url": "http:\/\/t.co\/tPLOXjoafD",
         "expanded_url": "http:\/\/goo.gl\/nE2Uqa",
         "display_url": "goo.gl\/nE2Uqa",
         "indices": [
           87,
           109
         ]
       }
   ]
                                           
user_mentions Other Twitter users mentioned in the text of the Tweet.

"user_mentions":
   [
    {
      "screen_name": "TweetMaker",
      "name": "Maker of tweets",
      "id": 1234567890,
      "id_str": "1234567890",
      "indices": [
        3,
        15
      ]
    }
    ]
                                           
media Represents media elements uploaded with the Tweet.

"media":
   [
   {
             "id": 4.8198342372925e+17,
             "id_str": "481983423729254400",
             "indices": [
               105,
               127
             ],
             "media_url": "http:\/\/pbs.twimg.com\/media\/BrBZXscCQBAYC3I.jpg",
             "media_url_https": "https:\/\/pbs.twimg.com\/media\/BrBZXscCQAASC3I.jpg",
             "url": "http:\/\/t.co\/JwO4Y11lpG",
             "display_url": "pic.twitter.com\/JwO4Y51lpG",
             "expanded_url": "http:\/\/twitter.com\/TweetMaker\/status\/481933421943172096\/photo\/1",
             "type": "photo",
             "sizes": {
               "small": {
                 "w": 340,
                 "h": 358,
                 "resize": "fit"
               },
               "medium": {
                 "w": 600,
                 "h": 631,
                 "resize": "fit"
               },
               "large": {
                 "w": 913,
                 "h": 960,
                 "resize": "fit"
               },
               "thumb": {
                 "w": 150,
                 "h": 150,
                 "resize": "crop"
               }
             }
           }
   ]
                                           
  "twitter_entities": {
    "hashtags": [
      {
        "text": "snow",
        "indices": [
          66,
          71
        ]
      },
      {
        "text": "skiing",
        "indices": [
          73,
          80
        ]
      }
    ],
    "symbols": [
      {
        "text": "THIS",
        "indices": [
          82,
          87
        ]
      },
      {
        "text": "THAT",
        "indices": [
          82,
          87
        ]
      }
    ],
    "urls": [
      {
        "url": "http:\/\/t.co\/tPLOXjoafD",
        "expanded_url": "http:\/\/goo.gl\/nE2Uqa",
        "display_url": "goo.gl\/nE2Uqa",
        "indices": [
          87,
          109
        ]
      }
    ],
    "user_mentions": [
      {
        "screen_name": "TweetMaker",
        "name": "Maker of tweets",
        "id": 1234567890,
        "id_str": "1234567890",
        "indices": [
          3,
          15
        ]
      }
    ],
    "media": [
      {
        "id": 4.8198342372925e+17,
        "id_str": "481983423729254400",
        "indices": [
          105,
          127
        ],
        "media_url": "http:\/\/pbs.twimg.com\/media\/BrBZXscCQBAYC3I.jpg",
        "media_url_https": "https:\/\/pbs.twimg.com\/media\/BrBZXscCQAASC3I.jpg",
        "url": "http:\/\/t.co\/JwO4Y11lpG",
        "display_url": "pic.twitter.com\/JwO4Y51lpG",
        "expanded_url": "http:\/\/twitter.com\/TweetMaker\/status\/481933421943172096\/photo\/1",
        "type": "photo",
        "sizes": {
          "small": {
            "w": 340,
            "h": 358,
            "resize": "fit"
          },
          "medium": {
            "w": 600,
            "h": 631,
            "resize": "fit"
          },
          "large": {
            "w": 913,
            "h": 960,
            "resize": "fit"
          },
          "thumb": {
            "w": 150,
            "h": 150,
            "resize": "crop"
          }
        }
      }
    ]
  }
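Since root-level entities may be truncated in Retweets, a consumer can prefer object.twitter_entities whenever the verb is "share". The sketch below illustrates that rule; the function names are hypothetical and the fallback to the root-level entities is an assumption made for robustness.

```python
from typing import Any, Dict, List


def effective_entities(activity: Dict[str, Any]) -> Dict[str, Any]:
    """Return the entities to trust: the nested ones for Retweets, else the root-level ones."""
    if activity.get("verb") == "share":
        nested = (activity.get("object") or {}).get("twitter_entities")
        if nested is not None:
            return nested
    return activity.get("twitter_entities") or {}


def hashtag_texts(activity: Dict[str, Any]) -> List[str]:
    """List hashtag texts from the non-truncated entities."""
    return [tag["text"] for tag in effective_entities(activity).get("hashtags", [])]


retweet = {
    "verb": "share",
    "twitter_entities": {"hashtags": [{"text": "snow", "indices": [66, 71]}]},
    "object": {
        "twitter_entities": {
            "hashtags": [
                {"text": "snow", "indices": [66, 71]},
                {"text": "skiing", "indices": [73, 80]},
            ]
        }
    },
}
print(hashtag_texts(retweet))
```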

                        

twitter_extended_entities

An object from Twitter's native data format containing "media". This will be present for any Tweet where the twitter_entities object has data present in the "media" field, and will include multiple photos where present in the post. Note that this is the correct location to retrieve media information for multi-photo posts.

Multiple photos are represented by comma-separated JSON objects within the "media" array.

"twitter_extended_entities": {
    "media": [
      {
        "id": 5.0475663942184e+17,
        "id_str": "504756639421837312",
        "indices": [
          64,
          86
        ],
        "media_url": "http:\/\/pbs.twimg.com\/media\/BwFBfT7CMAA-Jct.jpg",
        "media_url_https": "https:\/\/pbs.twimg.com\/media\/BwFBfT7CMAA-Jct.jpg",
        "url": "http:\/\/t.co\/VPXUMLqPKI",
        "display_url": "pic.twitter.com\/VPXUMLqPKI",
        "expanded_url": "http:\/\/twitter.com\/RedSox\/status\/504756640550514688\/photo\/1",
        "type": "photo",
        "sizes": {
          "medium": {
            "w": 600,
            "h": 800,
            "resize": "fit"
          },
          "large": {
            "w": 645,
            "h": 860,
            "resize": "fit"
          },
          "thumb": {
            "w": 150,
            "h": 150,
            "resize": "crop"
          },
          "small": {
            "w": 340,
            "h": 453,
            "resize": "fit"
          }
        }
      },
      {
        "id": 5.0475663942185e+17,
        "id_str": "504756639421853696",
        "indices": [
          64,
          86
        ],
        "media_url": "http:\/\/pbs.twimg.com\/media\/BwFBfT7CcAAYBhR.jpg",
        "media_url_https": "https:\/\/pbs.twimg.com\/media\/BwFBfT7CcAAYBhR.jpg",
        "url": "http:\/\/t.co\/VPXUMLqPKI",
        "display_url": "pic.twitter.com\/VPXUMLqPKI",
        "expanded_url": "http:\/\/twitter.com\/RedSox\/status\/504756640550514688\/photo\/1",
        "type": "photo",
        "sizes": {
          "large": {
            "w": 773,
            "h": 579,
            "resize": "fit"
          },
          "medium": {
            "w": 600,
            "h": 449,
            "resize": "fit"
          },
          "thumb": {
            "w": 150,
            "h": 150,
            "resize": "crop"
          },
          "small": {
            "w": 340,
            "h": 254,
            "resize": "fit"
          }
        }
      }
    ]
  }
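To collect every photo of a multi-photo post, a consumer can read twitter_extended_entities first. The following sketch does so and falls back to twitter_entities when the extended object is absent; that fallback and the name `media_urls` are assumptions of this example. It reads id_str rather than id because the numeric id appears in exponent notation in the samples above.

```python
from typing import Any, Dict, List, Tuple


def media_urls(activity: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (id_str, media_url_https) pairs, preferring twitter_extended_entities."""
    for key in ("twitter_extended_entities", "twitter_entities"):
        media = (activity.get(key) or {}).get("media")
        if media:
            return [(item["id_str"], item["media_url_https"]) for item in media]
    return []


activity = {
    "twitter_entities": {
        "media": [
            {"id_str": "504756639421837312",
             "media_url_https": "https://pbs.twimg.com/media/BwFBfT7CMAA-Jct.jpg"},
        ]
    },
    "twitter_extended_entities": {
        "media": [
            {"id_str": "504756639421837312",
             "media_url_https": "https://pbs.twimg.com/media/BwFBfT7CMAA-Jct.jpg"},
            {"id_str": "504756639421853696",
             "media_url_https": "https://pbs.twimg.com/media/BwFBfT7CcAAYBhR.jpg"},
        ]
    },
}
for media_id, url in media_urls(activity):
    print(media_id, url)
```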
                            

link

A Permalink for the tweet.

"link": "http:\/\/twitter.com\/KidCodo\/statuses\/347769243409977344"
                            

body

The tweet text.

In Retweets, note that Twitter modifies the value of the body at the root level by adding "RT @username" at the beginning, and by truncating the original text and adding an ellipsis at the end. Thus, for Retweets, your app should look at the object.body to ensure that it is extracting the non-modified text of the original Tweet (being retweeted).
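The rule above can be sketched as a small accessor that returns object.body for Retweets and the root-level body otherwise. The function name and the fallback to the root-level body when the nested one is missing are assumptions of this example.

```python
from typing import Any, Dict


def original_text(activity: Dict[str, Any]) -> str:
    """Return the unmodified Tweet text: object.body for Retweets, body otherwise."""
    root_body = activity.get("body", "")
    if activity.get("verb") == "share":
        nested = activity.get("object") or {}
        return nested.get("body", root_body)
    return root_body


retweet = {
    "verb": "share",
    "body": "RT @KidCodo: With Cardiff, Crystal Palace, and Hull City joining the EPL...",
    "object": {"body": "With Cardiff, Crystal Palace, and Hull City joining the EPL from the Championship it will be a great relegation battle at the end."},
}
print(original_text(retweet))
```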

"body": "With Cardiff, Crystal Palace, and Hull City joining the EPL from the Championship it will be a great relegation battle at the end."
                            

objectType

"activity"

"objectType": "activity"
                            

object

An object representing the Tweet being posted or shared.

For Retweets, this will contain an entire "activity", with the pertinent fields described in this schema.

For Original Tweets, this will contain a "note" object, with the fields described here.


Property Description
objectType "note"

See http://activitystrea.ms/head/activity-schema.html#note for more detailed information.

"objectType": "note"
                                        
id An IRI for the tweet.

"id": "object:search.twitter.com,2005:349575936838082561"
                                        
summary The text of the tweet.

"summary": "\u201c@GOP: .@RNCResearch Obama\u2019s lack of leadership landing him in a \u2018dead zone presidency\u2019 http:\/\/t.co\/Eygp4TRqkF\u201d"
                                        
link The permalink to the tweet.

"link": "http:\/\/twitter.com\/chuckleslong\/statuses\/349575936838082561"
                                        
postedTime The creation time of the tweet.

"postedTime": "2013-06-25T17:12:52.000Z"
                                        
 "object":
 {
    "objectType": "note",
    "id": "object:search.twitter.com,2005:347769243409977344",
    "summary": "With Cardiff, Crystal Palace, and Hull City joining the EPL from the Championship it will be a great relegation battle at the end.",
    "link": "http:\/\/twitter.com\/KidCodo\/statuses\/347769243409977344",
    "postedTime": "2013-06-20T17:33:43.000Z"
 }
                        

postedTime

The time the action occurred, e.g. the time the Tweet was posted.
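A postedTime value such as "2013-06-25T17:12:52.000Z" can be parsed into a timezone-aware datetime. This is a minimal sketch that assumes the value always ends in a literal "Z" with fractional seconds, as in the samples in this document.

```python
from datetime import datetime, timezone


def parse_posted_time(value: str) -> datetime:
    """Parse a postedTime string into a timezone-aware UTC datetime."""
    # The trailing "Z" is matched literally, so the UTC zone is attached explicitly.
    naive = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return naive.replace(tzinfo=timezone.utc)


print(parse_posted_time("2013-06-25T17:12:52.000Z").isoformat())
```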

"postedTime": "2013-06-25T17:12:52.000Z"
                            

Compliance Activities 

Status Delete Activities 

Status Delete Activities occur when a Twitter user deletes a Tweet. The following table describes the format of the root-level object of these activities.

objectType

"activity"


"objectType":"activity"
                            

verb

"delete"


"verb":"delete"
                            

object.id

An object containing the ID of the Tweet being deleted.


"object": {
    "id":"tag:search.twitter.com,2005:361114985893605376"
}
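As a sketch of honoring a Status Delete, the function below removes the referenced Tweet from a local store keyed by the activity id. It assumes the delete's object.id matches the root-level id under which the Tweet was stored, which the shared "tag:" prefix in the samples suggests; the in-memory dict and the function name are hypothetical.

```python
from typing import Any, Dict


def apply_status_delete(store: Dict[str, Dict[str, Any]], activity: Dict[str, Any]) -> bool:
    """Remove the deleted Tweet from the store; return True if something was removed."""
    if activity.get("verb") != "delete":
        return False
    tweet_id = activity["object"]["id"]
    return store.pop(tweet_id, None) is not None


store = {
    "tag:search.twitter.com,2005:361114985893605376": {"body": "to be deleted"},
    "tag:search.twitter.com,2005:347769243409977344": {"body": "kept"},
}
delete = {
    "objectType": "activity",
    "verb": "delete",
    "object": {"id": "tag:search.twitter.com,2005:361114985893605376"},
}
print(apply_status_delete(store, delete), len(store))
```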
                            

actor.id

The ID of the Twitter user deleting their Tweet.


"actor": {
    "id":"id:twitter.com:593567038"
}
                            

timestampMs

The timestamp of the Delete activity, with millisecond granularity.
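The timestampMs values in this document carry an explicit "+00:00" offset and millisecond precision, so they can be parsed with the standard library. This is a sketch assuming that shape holds for all compliance activities; `parse_timestamp_ms` is a hypothetical name.

```python
from datetime import datetime


def parse_timestamp_ms(value: str) -> datetime:
    """Parse a timestampMs string (ISO 8601 with offset) into an aware datetime."""
    return datetime.fromisoformat(value)


first = parse_timestamp_ms("2014-08-27T23:49:40.532+00:00")
second = parse_timestamp_ms("2014-08-27T23:49:41.735+00:00")
# Aware datetimes compare by instant, so compliance events can be ordered directly.
print(first < second)
```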


"timestampMs":"2014-08-27T23:49:41.735+00:00"
                            

User Delete Activities 

User Delete Activities occur when a Twitter user’s entire account and associated Tweets are deleted. The following table describes the format of the root-level object of these activities.

verb

"user_delete". Occurs where a user deletes their Twitter account.


"verb":"user_delete"
                        

object.id

An object containing the ID of the user account being deleted.


"object": {
    "id":"tag:search.twitter.com,2012:user/1602954684"
}
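User IDs appear in two shapes in this document: "id:twitter.com:<number>" in actor.id and "tag:search.twitter.com,2012:user/<number>" in user-level compliance activities. The sketch below normalizes both to the numeric ID so they can be matched against each other. Both patterns are inferred from the samples and are assumptions; the helper name is hypothetical.

```python
import re

# Patterns inferred from the sample values in this document.
_USER_ID_PATTERNS = (
    re.compile(r"^id:twitter\.com:(\d+)$"),
    re.compile(r"^tag:search\.twitter\.com,2012:user/(\d+)$"),
)


def user_numeric_id(value: str) -> str:
    """Return the numeric user ID, as a string, from either documented ID shape."""
    for pattern in _USER_ID_PATTERNS:
        match = pattern.match(value)
        if match:
            return match.group(1)
    raise ValueError(f"unexpected user id: {value!r}")


print(user_numeric_id("tag:search.twitter.com,2012:user/1602954684"))
print(user_numeric_id("id:twitter.com:277184168"))
```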
                        

timestampMs

The timestamp of the User Delete activity.


"timestampMs":"2014-08-27T23:49:40.532+00:00"
                            

User Undelete Activities 

User Undelete Activities occur when a Twitter user’s account is undeleted. The following table describes the format of the root-level object of these activities.

verb

"user_undelete". This occurs when a Twitter user "un-deletes" their account.


"verb":"user_undelete"
                            

object.id

An object containing the ID of the user account being un-deleted.


"object": {
    "id":"tag:search.twitter.com,2012:user/930913333"
}
                            

timestampMs

The timestamp of the user un-delete activity.


"timestampMs":"2014-08-27T23:49:41.839+00:00"
                            

User Protect Activities 

User Protect Activities occur when a Twitter user sets their account to protected. Accounts with protected Tweets require manual approval of each person who may view that account’s Tweets. The following table describes the format of the root-level object of these activities.

verb

"user_protect". This occurs when a Twitter user switches their account from public to "protected".


"verb":"user_protect"
                        

object.id

An object containing the ID of the user account being protected.


"object": {
    "id":"tag:search.twitter.com,2012:user/930913333"
}
                        

timestampMs

The timestamp of the user protect activity.


"timestampMs":"2014-08-27T23:49:41.839+00:00"
                        

User Unprotect Activities 

User Unprotect Activities occur when a Twitter user sets their account to unprotected. Unprotected accounts are public, which is the default setting. The following table describes the format of the root-level object of these activities.

verb

"user_unprotect". This occurs when a Twitter user switches their account from "protected" to public.


"verb":"user_unprotect"
                        

object.id

An object containing the ID of the user account being unprotected.


"object": {
    "id":"tag:search.twitter.com,2012:user/930913333"
}
                        

timestampMs

The timestamp of the user unprotect activity.


"timestampMs":"2014-08-27T23:49:41.839+00:00"
                        

Scrub Geo Activities 

Scrub Geo Activities occur when a Twitter user removes the geo-location data from their previous Tweets. The following table describes the format of the root-level object of these activities.

objectType

"activity"


"objectType":"activity"
                        

verb

"scrub_geo". This occurs when a Twitter user removes Geo location information from past Tweets.


"verb":"scrub_geo"
                        

actor.id

An object containing the ID of the Twitter User removing Geo location information from their Tweets.


"actor": {
    "id":"id:twitter.com:598851423"
}
                        

target.up_to_id

Receiving this indicates that you should remove the geo data from the user's Tweets, up to the up_to_id value.


"target": {
    "up_to_id":"tag:search.twitter.com,2005:503024492839727104"
}
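One way to act on a Scrub Geo activity is sketched below. It makes several assumptions that this document does not confirm: Tweets are held in a dict keyed by activity id, the up_to_id bound is treated as inclusive, and the geo data to remove is taken to be the root-level "geo" and "location" fields. All function names are hypothetical.

```python
from typing import Any, Dict


def numeric_id(iri: str) -> int:
    """Return the trailing numeric part of a "tag:search.twitter.com,2005:<id>" IRI."""
    return int(iri.rsplit(":", 1)[1])


def apply_scrub_geo(store: Dict[str, Dict[str, Any]], scrub: Dict[str, Any]) -> int:
    """Strip geo fields from the user's stored Tweets up to up_to_id; return the count changed."""
    user_id = scrub["actor"]["id"]
    limit = numeric_id(scrub["target"]["up_to_id"])  # treated as inclusive (assumption)
    changed = 0
    for activity in store.values():
        if (activity.get("actor") or {}).get("id") != user_id:
            continue
        if numeric_id(activity["id"]) > limit:
            continue
        removed = [activity.pop(field, None) for field in ("geo", "location")]
        if any(value is not None for value in removed):
            changed += 1
    return changed


def _tweet(number: int, user: str) -> Dict[str, Any]:
    """Build a tiny illustrative stored Tweet with a geo point."""
    return {
        "id": f"tag:search.twitter.com,2005:{number}",
        "actor": {"id": user},
        "geo": {"type": "Point", "coordinates": [41.68291626, -87.77269682]},
    }


store = {t["id"]: t for t in (
    _tweet(100, "id:twitter.com:598851423"),
    _tweet(300, "id:twitter.com:598851423"),
    _tweet(150, "id:twitter.com:277184168"),
)}
scrub = {
    "verb": "scrub_geo",
    "actor": {"id": "id:twitter.com:598851423"},
    "target": {"up_to_id": "tag:search.twitter.com,2005:200"},
}
print(apply_scrub_geo(store, scrub))
```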
                        

timestampMs

The timestamp of the activity -- the time at which the user removed their Geo data from past Tweets.


"timestampMs":"2014-08-27T23:51:13.651+00:00"
                        

User Suspend Activities 

User Suspend Activities occur when a Twitter user has violated Twitter Rules, or an account is suspected of being hacked or compromised. The following table describes the format of the root-level object of these activities, in Twitter’s format (not Activity Stream format).

verb

"user_suspend". This occurs when a Twitter user is suspended.


"verb":"user_suspend"
                        

object.id

An object containing the ID of the user account being suspended.


"object": {
    "id":"tag:search.twitter.com,2012:user/930913333"
}
                        

timestampMs

The timestamp of the user suspend activity.


"timestampMs":"2014-08-27T23:49:41.839+00:00"
                        

User Unsuspend Activities 

User Unsuspend Activities occur when a Twitter user’s account has been unsuspended. The following table describes the format of the root-level object of these activities.

verb

"user_unsuspend". This occurs when a Twitter user is unsuspended.


"verb":"user_unsuspend"
                        

object.id

An object containing the ID of the user account being unsuspended.


"object": {
    "id":"tag:search.twitter.com,2012:user/930913333"
}
                        

timestampMs

The timestamp of the user unsuspend activity.


"timestampMs":"2014-08-27T23:49:41.839+00:00"
                        

User Withheld Activities 

User Withheld Activities occur when a Twitter user’s account has been withheld in specified areas of the world. The following table describes the format of the root-level object of these activities.

verb

"user_withheld". This occurs when a user has been withheld.


"verb": "user_withheld"
                        

object.id

The ID of the user account being withheld.


"object": {
        "id": "tag:search.twitter.com,2012:user/1375036644"
}
                        

object.withheld_in_countries

A JSON array of ISO-3166-1 alpha-2 country codes, representing the countries in which the user is withheld.


"object": {
        "withheld_in_countries": ["XY"]
}
                        

timestampMs

The timestamp of the activity.


"timestampMs":"2014-08-27T23:49:41.839+00:00"
                        

Status Withheld Activities 

Status Withheld Activities occur when a user’s Tweet has been withheld. The following table describes the format of the root-level object of these activities.

verb

"status_withheld". Occurs when Twitter withholds a Tweet from specific countries.


"verb": "status_withheld"
                        

object.id

The ID of the Tweet being withheld.


"object": {
     "id": "tag:search.twitter.com,2005:425794620677554176"
}
                        

object.user_id

The ID of the user whose Tweet is being withheld.


"object": {
     "user_id": "tag:search.twitter.com,2012:user/1375036644"
}
                        

object.withheld_in_countries

Contains a JSON array of ISO-3166-1 alpha-2 country codes. When a Tweet is withheld in all countries, an 'XX' country code will be used. When a Tweet is withheld due to a Digital Millennium Copyright Act (DMCA) complaint, an 'XY' country code will be used and an additional 'withheld_copyright' attribute will be set to true.


"object": {
     "withheld_in_countries": ["XY"]
}
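The special 'XX' and 'XY' codes described for withheld_in_countries can be separated from ordinary country codes as sketched below. The shape of the returned summary and the function name are choices made for this example only.

```python
from typing import Any, Dict, Iterable

SPECIAL_CODES = ("XX", "XY")  # 'XX': all countries; 'XY': DMCA complaint


def summarize_withheld(codes: Iterable[str]) -> Dict[str, Any]:
    """Split withheld_in_countries into the special markers and ordinary country codes."""
    codes = list(codes)
    return {
        "all_countries": "XX" in codes,
        "copyright": "XY" in codes,
        "countries": sorted(code for code in codes if code not in SPECIAL_CODES),
    }


print(summarize_withheld(["XY"]))
print(summarize_withheld(["FR", "DE"]))
```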
                        

timestampMs

The timestamp of the activity.


"timestampMs":"2014-08-27T23:49:41.839+00:00"
                        

Sample Payloads 

User Withheld

Status Withheld