Data Format
The following documentation describes Gnip’s Activity Streams format for Twitter data. For details on Twitter’s original data format, see Twitter’s documentation, or skip to the Sample Payloads section for a side-by-side comparison of the Original and Activity Streams formats.
Tweet Activities
Tweets, Retweets, Quote Tweets, and Deleted Tweets
The following table describes the root-level data structures for Tweet activities (Tweet, Retweet, Quote Tweet, and Deleted Tweet). For fields with multiple levels of sub-fields, click the links provided to reveal details about the sub-fields.
|
id |
A unique IRI for the tweet. In more detail, "tag" is the scheme, "search.twitter.com" represents the domain for the scheme, and 2005 is when the scheme was derived. When storing Tweets, this should be used as the unique identifier or primary key.
"id": "tag:search.twitter.com,2005:347769243409977344"
|
|
actor |
An object representing the Twitter user who posted the Tweet. It contains all metadata relevant to that user.
"actor":
{
"objectType": "person",
"id": "id:twitter.com:277184168",
"link": "http:\/\/www.twitter.com\/KidCodo",
"displayName": "Zach Codo",
"postedTime": "2011-04-04T21:31:20.000Z",
"image": "https:\/\/si0.twimg.com\/profile_images\/3664410292\/1d75c213a572873bf6797c5591475da5_normal.jpeg",
"summary": null,
"links":
[
{
"href": null,
"rel": "me"
}
],
"friendsCount": 64,
"followersCount": 207,
"listedCount": 1,
"statusesCount": 11207,
"twitterTimeZone": "Central Time (US & Canada)",
"verified": false,
"utcOffset": "-21600",
"preferredUsername": "KidCodo",
"languages":
[
"en"
],
"location":
{
"objectType": "place",
"displayName": "Naperville, IL"
},
"favoritesCount": 1123
}
|
|
verb |
The type of action being taken by the user.
"verb": "post"
|
|
generator |
An object representing the utility used to post the Tweet. This will contain the name ("displayName") and a link ("link") for the source application generating the Tweet.
"generator":
{
"displayName": "Twitter for iPhone",
"link": "http:\/\/twitter.com\/download\/iphone"
}
|
|
provider |
A JSON object representing the provider of the activity. This will contain an objectType ("service"), the name of the provider ("displayName"), and a link to the provider's website ("link").
"provider":
{
"objectType": "service",
"displayName": "Twitter",
"link": "http:\/\/www.twitter.com"
}
|
|
inReplyTo |
A JSON object referring to the Tweet being replied to, if applicable. Contains a link to the Tweet.
"inReplyTo":
{
"link": "http:\/\/twitter.com\/GOP\/statuses\/349573991561838593"
}
|
|
location |
A JSON object representing the Twitter "Place" where the tweet was created. This is an object passed through from the Twitter platform.
"location":
{
"objectType": "place",
"displayName": "Boulder, CO",
"name": "Boulder",
"country_code": "United States",
"twitter_country_code": "US",
"link": "https://api.twitter.com/1.1/geo/id/fd70c22040963ac7.json",
"geo": {
"type": "Polygon",
"coordinates":
[
[
[
-105.3017759,
39.953552
],
[
-105.3017759,
40.094411
],
[
-105.183597,
40.094411
],
[
-105.183597,
39.953552
]
]
]
},
"twitter_place_type": "city"
}
|
|
geo |
Point location where the Tweet was created.
"geo":
{
"type": "Point",
"coordinates":
[
41.68291626,
-87.77269682
]
}
|
|
twitter_entities |
The entities object from Twitter's data format. It contains lists of hashtags, symbols, URLs, user mentions, and media.
"twitter_entities": {
"hashtags": [
{
"text": "snow",
"indices": [
66,
71
]
},
{
"text": "skiing",
"indices": [
73,
80
]
}
],
"symbols": [
{
"text": "THIS",
"indices": [
82,
87
]
},
{
"text": "THAT",
"indices": [
82,
87
]
}
],
"urls": [
{
"url": "http:\/\/t.co\/tPLOXjoafD",
"expanded_url": "http:\/\/goo.gl\/nE2Uqa",
"display_url": "goo.gl\/nE2Uqa",
"indices": [
87,
109
]
}
],
"user_mentions": [
{
"screen_name": "TweetMaker",
"name": "Maker of tweets",
"id": 1234567890,
"id_str": "1234567890",
"indices": [
3,
15
]
}
],
"media": [
{
"id": 4.8198342372925e+17,
"id_str": "481983423729254400",
"indices": [
105,
127
],
"media_url": "http:\/\/pbs.twimg.com\/media\/BrBZXscCQBAYC3I.jpg",
"media_url_https": "https:\/\/pbs.twimg.com\/media\/BrBZXscCQAASC3I.jpg",
"url": "http:\/\/t.co\/JwO4Y11lpG",
"display_url": "pic.twitter.com\/JwO4Y51lpG",
"expanded_url": "http:\/\/twitter.com\/TweetMaker\/status\/481933421943172096\/photo\/1",
"type": "photo",
"sizes": {
"small": {
"w": 340,
"h": 358,
"resize": "fit"
},
"medium": {
"w": 600,
"h": 631,
"resize": "fit"
},
"large": {
"w": 913,
"h": 960,
"resize": "fit"
},
"thumb": {
"w": 150,
"h": 150,
"resize": "crop"
}
}
}
]
}
|
|
twitter_extended_entities |
An object from Twitter's native data format containing "media". This will be present for any Tweet where the twitter_entities object has data present in the "media" field, and will include multiple photos where present in the post. Note that this is the correct location to retrieve media information for multi-photo posts.
"twitter_extended_entities": {
"media": [
{
"id": 5.0475663942184e+17,
"id_str": "504756639421837312",
"indices": [
64,
86
],
"media_url": "http:\/\/pbs.twimg.com\/media\/BwFBfT7CMAA-Jct.jpg",
"media_url_https": "https:\/\/pbs.twimg.com\/media\/BwFBfT7CMAA-Jct.jpg",
"url": "http:\/\/t.co\/VPXUMLqPKI",
"display_url": "pic.twitter.com\/VPXUMLqPKI",
"expanded_url": "http:\/\/twitter.com\/RedSox\/status\/504756640550514688\/photo\/1",
"type": "photo",
"sizes": {
"medium": {
"w": 600,
"h": 800,
"resize": "fit"
},
"large": {
"w": 645,
"h": 860,
"resize": "fit"
},
"thumb": {
"w": 150,
"h": 150,
"resize": "crop"
},
"small": {
"w": 340,
"h": 453,
"resize": "fit"
}
}
},
{
"id": 5.0475663942185e+17,
"id_str": "504756639421853696",
"indices": [
64,
86
],
"media_url": "http:\/\/pbs.twimg.com\/media\/BwFBfT7CcAAYBhR.jpg",
"media_url_https": "https:\/\/pbs.twimg.com\/media\/BwFBfT7CcAAYBhR.jpg",
"url": "http:\/\/t.co\/VPXUMLqPKI",
"display_url": "pic.twitter.com\/VPXUMLqPKI",
"expanded_url": "http:\/\/twitter.com\/RedSox\/status\/504756640550514688\/photo\/1",
"type": "photo",
"sizes": {
"large": {
"w": 773,
"h": 579,
"resize": "fit"
},
"medium": {
"w": 600,
"h": 449,
"resize": "fit"
},
"thumb": {
"w": 150,
"h": 150,
"resize": "crop"
},
"small": {
"w": 340,
"h": 254,
"resize": "fit"
}
}
}
]
}
|
|
link |
A permalink for the Tweet.
"link": "http:\/\/twitter.com\/KidCodo\/statuses\/347769243409977344"
|
|
body |
The tweet text.
"body": "With Cardiff, Crystal Palace, and Hull City joining the EPL from the Championship it will be a great relegation battle at the end."
|
|
objectType |
"activity"
"objectType": "activity"
|
|
object |
An object representing the Tweet being posted or shared.
"object":
{
"objectType": "note",
"id": "object:search.twitter.com,2005:347769243409977344",
"summary": "With Cardiff, Crystal Palace, and Hull City joining the EPL from the Championship it will be a great relegation battle at the end.",
"link": "http:\/\/twitter.com\/KidCodo\/statuses\/347769243409977344",
"postedTime": "2013-06-20T17:33:43.000Z"
}
|
|
postedTime |
The time the action occurred, e.g. the time the Tweet was posted.
"postedTime": "2013-06-25T17:12:52.000Z"
|
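As an illustration of how the root-level fields above might be consumed, the following Python sketch pulls the numeric Tweet ID out of the `id` IRI, the numeric user ID out of `actor.id`, parses `postedTime`, and collects media IDs, preferring `twitter_extended_entities` as recommended above for multi-photo posts. The helper names (`tweet_id_from_iri`, `summarize_activity`) and the layout of the returned dictionary are hypothetical and not part of the format; the sample values are taken from the examples in the table.

```python
from datetime import datetime, timezone


def tweet_id_from_iri(iri):
    """Return the numeric Tweet ID at the end of a tag IRI, as a string."""
    # e.g. "tag:search.twitter.com,2005:347769243409977344"
    return iri.rsplit(":", 1)[1]


def summarize_activity(activity):
    """Collect a few root-level fields from a Tweet activity (hypothetical helper)."""
    # Prefer twitter_extended_entities for media; fall back to twitter_entities.
    entities = (
        activity.get("twitter_extended_entities")
        or activity.get("twitter_entities")
        or {}
    )
    media = entities.get("media", [])
    # postedTime looks like "2013-06-20T17:33:43.000Z" in the samples (UTC).
    posted = datetime.strptime(activity["postedTime"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return {
        "primary_key": activity["id"],  # the full IRI is the unique identifier
        "tweet_id": tweet_id_from_iri(activity["id"]),
        "user_id": activity["actor"]["id"].rsplit(":", 1)[1],
        "posted": posted.replace(tzinfo=timezone.utc),
        # Read id_str rather than id: the samples show the numeric id in a
        # floating-point form that does not match id_str digit for digit.
        "media_ids": [m["id_str"] for m in media],
    }


sample = {
    "id": "tag:search.twitter.com,2005:347769243409977344",
    "actor": {"id": "id:twitter.com:277184168"},
    "postedTime": "2013-06-20T17:33:43.000Z",
}
summary = summarize_activity(sample)
print(summary["tweet_id"], summary["user_id"], summary["posted"].isoformat())
```

Keeping the full IRI as the primary key follows the guidance for the `id` field; the numeric form is only a convenience for joins against data stored in Twitter's original format.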
Compliance Activities
Status Delete Activities
Status Delete Activities occur when a Twitter user deletes a Tweet. The following table describes the format of the root-level object of these activities.
|
objectType |
"activity"
"objectType":"activity"
|
|
verb |
"delete"
"verb":"delete"
|
|
object.id |
An object containing the ID of the Tweet being deleted.
"object": {
"id":"tag:search.twitter.com,2005:361114985893605376"
}
|
|
actor.id |
The ID of the Twitter user deleting their Tweet.
"actor": {
"id":"id:twitter.com:593567038"
}
|
|
timestampMs |
The timestamp of the Delete activity, with millisecond granularity.
"timestampMs":"2014-08-27T23:49:41.735+00:00"
|
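A minimal sketch of how a consumer might act on a Status Delete activity, assuming stored Tweets are kept in a dictionary keyed by the activity `id` IRI (the primary key recommended earlier). The function name `apply_status_delete` and the in-memory `store` are hypothetical; a real system would issue the equivalent delete against its own datastore.

```python
def apply_status_delete(store, activity):
    """Remove the deleted Tweet from `store`; return True if something was removed."""
    if activity.get("verb") != "delete":
        return False
    # object.id uses the same tag IRI as the original Tweet activity's id.
    tweet_key = activity["object"]["id"]
    return store.pop(tweet_key, None) is not None


# Hypothetical store holding one previously received Tweet.
store = {"tag:search.twitter.com,2005:361114985893605376": {"body": "example"}}

delete_activity = {
    "objectType": "activity",
    "verb": "delete",
    "object": {"id": "tag:search.twitter.com,2005:361114985893605376"},
    "actor": {"id": "id:twitter.com:593567038"},
    "timestampMs": "2014-08-27T23:49:41.735+00:00",
}

removed = apply_status_delete(store, delete_activity)
print(removed, len(store))
```

Returning a boolean rather than raising on an unknown ID is a design choice for this sketch: a delete may arrive for a Tweet the consumer never stored.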
User Delete Activities
User Delete Activities occur when a Twitter user’s entire account and associated Tweets are deleted. The following table describes the format of the root-level object of these activities.
|
verb |
"user_delete". Occurs where a user deletes their Twitter account.
"verb":"user_delete"
|
|
object.id |
An object containing the ID of the user account being deleted.
"object": {
"id":"tag:search.twitter.com,2012:user/1602954684"
}
|
|
timestampMs |
The timestamp of the User Delete activity.
"timestampMs":"2014-08-27T23:49:40.532+00:00"
|
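The three fields of a User Delete activity can be unpacked as in the sketch below, which extracts the numeric user ID from `object.id` and parses `timestampMs` into a timezone-aware datetime. The function name `parse_user_event` is hypothetical. The other `user_*` compliance samples in this document show the same `verb`, `object.id`, and `timestampMs` fields, so the same helper may apply to them, though that is an assumption drawn from the samples rather than a stated guarantee.

```python
from datetime import datetime


def parse_user_event(activity):
    """Return (verb, numeric user ID as a string, event time) for a user-level activity."""
    # object.id looks like "tag:search.twitter.com,2012:user/1602954684".
    user_id = activity["object"]["id"].rsplit("/", 1)[1]
    # timestampMs is an ISO 8601 string with a UTC offset in the samples.
    when = datetime.fromisoformat(activity["timestampMs"])
    return activity["verb"], user_id, when


event = {
    "verb": "user_delete",
    "object": {"id": "tag:search.twitter.com,2012:user/1602954684"},
    "timestampMs": "2014-08-27T23:49:40.532+00:00",
}
verb, user_id, when = parse_user_event(event)
print(verb, user_id, when.isoformat())
```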
User Undelete Activities
User Undelete Activities occur when a Twitter user’s account is undeleted. The following table describes the format of the root-level object of these activities.
|
verb |
"user_undelete". This occurs when a Twitter user "un-deletes" their account.
"verb":"user_undelete"
|
|
object.id |
An object containing the ID of the user account being un-deleted.
"object": {
"id":"tag:search.twitter.com,2012:user/930913333"
}
|
|
timestampMs |
The timestamp of the user un-delete activity.
"timestampMs":"2014-08-27T23:49:41.839+00:00"
|
User Protect Activities
User Protect Activities occur when a Twitter user sets their account as protected. An account with protected Tweets requires its owner to manually approve each person who may view that account’s Tweets. The following table describes the format of the root-level object of these activities.
|
verb |
"user_protect". This occurs when a Twitter user switches their account from public to "protected".
"verb":"user_protect"
|
|
object.id |
An object containing the ID of the user account being protected.
"object": {
"id":"tag:search.twitter.com,2012:user/930913333"
}
|
|
timestampMs |
The timestamp of the user protect activity.
"timestampMs":"2014-08-27T23:49:41.839+00:00"
|
User Unprotect Activities
User Unprotect Activities occur when a Twitter user sets their account as unprotected. Unprotected accounts are public; this is the default setting. The following table describes the format of the root-level object of these activities.
|
verb |
"user_unprotect". This occurs when a Twitter user switches their account from "protected" to public.
"verb":"user_unprotect"
|
|
object.id |
An object containing the ID of the user account being unprotected.
"object": {
"id":"tag:search.twitter.com,2012:user/930913333"
}
|
|
timestampMs |
The timestamp of the user unprotect activity.
"timestampMs":"2014-08-27T23:49:41.839+00:00"
|
Scrub Geo Activities
Scrub Geo Activities occur when a Twitter user removes the geo-location data from their previous Tweets. The following table describes the format of the root-level object of these activities.
|
objectType |
"activity"
"objectType":"activity"
|
|
verb |
"scrub_geo". This occurs when a Twitter user removes Geo location information from past Tweets.
"verb":"scrub_geo"
|
|
actor.id |
An object containing the ID of the Twitter User removing Geo location information from their Tweets.
"actor": {
"id":"id:twitter.com:598851423"
}
|
|
target.up_to_id |
Receiving this indicates that you should remove the geo data from the user's Tweets, up to the up_to_id value.
"target": {
"up_to_id":"tag:search.twitter.com,2005:503024492839727104"
}
|
|
timestampMs |
The timestamp of the activity -- the time at which the user removed their Geo data from past Tweets.
"timestampMs":"2014-08-27T23:51:13.651+00:00"
|
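One way to act on a Scrub Geo activity is sketched below. It rests on several assumptions that the format itself does not state: that "up to" includes the Tweet named by `up_to_id`, that numeric Tweet IDs increase over time so they can be compared as integers, and that both the `geo` and `location` fields count as geo data to remove. The function names and the list-of-dicts storage are hypothetical, as are the Tweet IDs in the stored sample.

```python
def numeric_id(iri):
    """Return the integer at the end of a tag IRI such as 'tag:...,2005:123'."""
    return int(iri.rsplit(":", 1)[1])


def apply_scrub_geo(tweets, activity):
    """Strip geo fields from the actor's stored Tweets up to target.up_to_id.

    Returns the number of Tweets that actually had geo data removed.
    """
    user = activity["actor"]["id"]
    limit = numeric_id(activity["target"]["up_to_id"])  # assumed inclusive
    scrubbed = 0
    for tweet in tweets:
        if tweet["actor"]["id"] != user or numeric_id(tweet["id"]) > limit:
            continue
        had_geo = tweet.pop("geo", None) is not None
        had_location = tweet.pop("location", None) is not None
        if had_geo or had_location:
            scrubbed += 1
    return scrubbed


scrub = {
    "objectType": "activity",
    "verb": "scrub_geo",
    "actor": {"id": "id:twitter.com:598851423"},
    "target": {"up_to_id": "tag:search.twitter.com,2005:503024492839727104"},
    "timestampMs": "2014-08-27T23:51:13.651+00:00",
}

# Hypothetical stored Tweets: one older than the limit, one newer, one by another user.
point = {"type": "Point", "coordinates": [41.68291626, -87.77269682]}
tweets = [
    {"id": "tag:search.twitter.com,2005:503024492839727000",
     "actor": {"id": "id:twitter.com:598851423"}, "geo": dict(point)},
    {"id": "tag:search.twitter.com,2005:503024492839727999",
     "actor": {"id": "id:twitter.com:598851423"}, "geo": dict(point)},
    {"id": "tag:search.twitter.com,2005:100",
     "actor": {"id": "id:twitter.com:277184168"}, "geo": dict(point)},
]
count = apply_scrub_geo(tweets, scrub)
print(count)
```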
User Suspend Activities
User Suspend Activities occur when a Twitter user has violated Twitter Rules, or an account is suspected of being hacked or compromised. The following table describes the format of the root-level object of these activities, in Twitter’s format (not Activity Stream format).
|
verb |
"user_suspend". This occurs when a Twitter user is suspended.
"verb":"user_suspend"
|
|
object.id |
An object containing the ID of the user account being suspended.
"object": {
"id":"tag:search.twitter.com,2012:user/930913333"
}
|
|
timestampMs |
The timestamp of the user suspend activity.
"timestampMs":"2014-08-27T23:49:41.839+00:00"
|
User Unsuspend Activities
User Unsuspend Activities occur when a Twitter user’s account has been unsuspended. The following table describes the format of the root-level object of these activities.
|
verb |
"user_unsuspend". This occurs when a Twitter user is unsuspended.
"verb":"user_unsuspend"
|
|
object.id |
An object containing the ID of the user account being unsuspended.
"object": {
"id":"tag:search.twitter.com,2012:user/930913333"
}
|
|
timestampMs |
The timestamp of the user unsuspend activity.
"timestampMs":"2014-08-27T23:49:41.839+00:00"
|
User Withheld Activities
User Withheld Activities occur when a Twitter user’s account has been withheld in specified areas of the world. The following table describes the format of the root-level object of these activities.
|
verb |
"user_withheld". This occurs when a user has been withheld.
"verb": "user_withheld"
|
|
object.id |
The ID of the user account being withheld.
"object": {
"id": "tag:search.twitter.com,2012:user/1375036644"
}
|
|
object.withheld_in_countries |
A JSON array of ISO-3166-1 alpha-2 country codes, representing the countries in which the user is withheld.
"object": {
"withheld_in_countries": ["XY"]
}
|
|
timestampMs |
The timestamp of the activity.
"timestampMs":"2014-08-27T23:49:41.839+00:00"
|
Status Withheld Activities
Status Withheld Activities occur when a user’s Tweet has been withheld. The following table describes the format of the root-level object of these activities.
|
verb |
"status_withheld". Occurs when Twitter withholds a Tweet from specific countries.
"verb": "status_withheld"
|
|
object.id |
The ID of the Tweet being withheld.
"object": {
"id": "tag:search.twitter.com,2005:425794620677554176"
}
|
|
object.user_id |
The ID of the user whose Tweet is being withheld.
"object": {
"user_id": "tag:search.twitter.com,2012:user/1375036644"
}
|
|
object.withheld_in_countries |
Contains a JSON array of ISO-3166-1 alpha-2 country codes. When a Tweet is withheld in all countries, an 'XX' country code is used. When a Tweet is withheld due to a Digital Millennium Copyright Act (DMCA) complaint, an 'XY' country code is used and an additional 'withheld_copyright' attribute is set to true.
"object": {
"withheld_in_countries": ["XY"]
}
|
|
timestampMs |
The timestamp of the activity.
"timestampMs":"2014-08-27T23:49:41.839+00:00"
|
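The special country codes described above can be separated from ordinary ISO codes as in the following sketch. The function name `describe_withheld` and the shape of its return value are hypothetical. The sketch reads only `withheld_in_countries`, because the exact location of the `withheld_copyright` attribute is not shown in the samples.

```python
def describe_withheld(activity):
    """Classify the codes in object.withheld_in_countries."""
    codes = activity["object"].get("withheld_in_countries", [])
    return {
        "everywhere": "XX" in codes,  # withheld in all countries
        "dmca": "XY" in codes,        # withheld due to a DMCA complaint
        "countries": [c for c in codes if c not in ("XX", "XY")],
    }


withheld = {
    "verb": "status_withheld",
    "object": {
        "id": "tag:search.twitter.com,2005:425794620677554176",
        "user_id": "tag:search.twitter.com,2012:user/1375036644",
        "withheld_in_countries": ["XY"],
    },
    "timestampMs": "2014-08-27T23:49:41.839+00:00",
}
info = describe_withheld(withheld)
print(info)
```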