Parsing Twitch chat to build a bot
- Friday July 12 2019
- http websockets web
I was recently pitched the idea of creating a Twitch chat bot that runs a script when viewers take a specific action. Twitch chat is basically just the IRC protocol over WebSockets, with lots of custom extensions for structured data. The final implementation was simple, but I learned a lot of lessons along the way. This article presents all of that knowledge so you don't have to learn it the hard way.
Establishing the connection to Twitch Chat API
The first step is to get connected to Twitch chat and make sure you can receive all the data you are interested in.
Websocket connection
To connect to Twitch chat you'll need a WebSocket client. There is a pretty extensive list of available clients here. Once you have a client, connect to the URL wss://irc-ws.chat.twitch.tv:443. The URL never changes no matter which chat you want to receive messages from.
Once you're connected, messages arrive over the WebSocket one at a time.
Capabilities Request
The next step is to send a Capabilities Request. This tells Twitch what you want to receive. To request everything, send a message like this
CAP REQ :twitch.tv/tags twitch.tv/commands twitch.tv/membership
This tells Twitch that you want all capabilities. The server acknowledges your request with a message like this
:tmi.twitch.tv CAP * ACK :twitch.tv/tags twitch.tv/commands twitch.tv/membership
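As a rough Python sketch, the request line can be built and the acknowledgement checked as shown below. The function names are hypothetical, and the parsing assumes the ACK line always has the shape shown above.

```python
# The three capabilities requested in this article.
CAPABILITIES = ("twitch.tv/tags", "twitch.tv/commands", "twitch.tv/membership")


def cap_request(capabilities=CAPABILITIES):
    """Build the CAP REQ line asking for the given capabilities."""
    return "CAP REQ :" + " ".join(capabilities)


def acknowledged_capabilities(line):
    """Return the set of capabilities named in a CAP ACK line.

    Returns an empty set if the line is not a CAP ACK.
    """
    head, sep, tail = line.strip().partition(" ACK :")
    if not sep or not head.endswith("CAP *"):
        return set()
    return set(tail.split())
```

Comparing the returned set against the capabilities you asked for is a simple way to confirm that nothing was dropped.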
Authentication
To authenticate with Twitch, you must send a token that uniquely identifies you. This is done by sending a message like this
PASS oauth:sdalkfjasdasdfaf
Where you would send your actual token instead. To simply generate a token, go to this page and click "Connect". The generated password on the next page is your token, including the part starting with "oauth:". This is the fastest way to get connected to the Twitch Chat API.
If you have an existing application and want to expand its authorization, you need to add the following scopes to your OAuth token request.
- chat:edit
- chat:read
- whispers:read
- whispers:edit
These scopes are from the bottom of the page here.
After PASS is sent your connection is authenticated. At this time you should send a NICK message like this
NICK myusername
Where myusername is replaced with whatever you want the bot to appear in your chat as.
If you've successfully authenticated the server will reply with messages like this
:tmi.twitch.tv 001 thewolfpack :Welcome, GLHF!
:tmi.twitch.tv 002 thewolfpack :Your host is tmi.twitch.tv
:tmi.twitch.tv 003 thewolfpack :This server is rather new
:tmi.twitch.tv 004 thewolfpack :-
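One way to detect a successful login is to look for the 001 welcome line. The sketch below is an assumption-laden example: it assumes several IRC lines may arrive in a single WebSocket frame, so it splits the frame into lines first, and login_succeeded is a hypothetical helper name.

```python
def login_succeeded(frame):
    """Return True if any line in the frame is the 001 welcome reply."""
    # splitlines() handles both "\r\n" and "\n" line endings.
    for line in frame.splitlines():
        parts = line.split(" ", 2)
        if len(parts) >= 2 and parts[0] == ":tmi.twitch.tv" and parts[1] == "001":
            return True
    return False
```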
Join channels
At this point you are ready to join one or more channels. This is done by just sending a message
JOIN #channel
Where #channel is replaced with the name of the stream chat you want to join. For example, to join the stream chat at twitch.tv/bob you would send the command JOIN #bob. You can join more than one channel if desired. Keep in mind that the streamer doesn't actually need to be streaming for you to join their chat. If there is a limit to the number of channels, I have not been able to find it.
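Putting the steps so far together, here is a small Python sketch that produces the lines to send, in order. The function name handshake_lines is hypothetical, and how each line is actually sent depends on the WebSocket client you chose.

```python
def handshake_lines(token, nick, channels):
    """Return the lines to send, in order, to log in and join channels.

    token must already include the "oauth:" prefix.
    """
    lines = [
        "CAP REQ :twitch.tv/tags twitch.tv/commands twitch.tv/membership",
        "PASS " + token,
        "NICK " + nick,
    ]
    for channel in channels:
        # Accept channel names with or without the leading "#".
        lines.append("JOIN #" + channel.lstrip("#"))
    return lines
```

Each returned string would be sent as its own WebSocket message, for example by looping over the list and calling your client's send method.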
At this point you are actually receiving messages from the Twitch stream chat.
Chat messages
As indicated before, the Twitch Chat is basically just IRC with lots of extensions.
There are three types of messages you'll receive
- User notices
- "Private messages"
- Host targets. The first part of the message always starts with :tmi.twitch.tv
User notices are used to deliver lots of information about the users currently in the chat. I have "Private messages" in quotes for reasons I'll explain below. Host targets inform you when one channel hosts another.
Parsing User Notices
Each user notice arrives formatted like this
@badge-info=subscriber/8;badges=subscriber/6,bits/75000;color=#1E90FF;display-name=Ovojaytee;emotes=1837404:44-50/915234:164-169/1093027:13-18;flags=;id=aa52e1d2-6ff5-42ba-b205-9d4a15f9dbf8;login=ovojaytee;mod=0;msg-id=resub;msg-param-cumulative-months=7;msg-param-months=0;msg-param-should-share-streak=1;msg-param-streak-months=8;msg-param-sub-plan-name=Channel\sSubscription\s(loeya);msg-param-sub-plan=1000;room-id=166279350;subscriber=1;system-msg=Ovojaytee\ssubscribed\sat\sTier\s1.\sThey've\ssubscribed\sfor\s8\smonths,\scurrently\son\sa\s8\smonth\sstreak!;tmi-sent-ts=1558352544376;user-id=160605648;user-type= :tmi.twitch.tv USERNOTICE #loeya :Wow 8 months loeyaH our baby is almost here loeyaHM can we name him Zlatan ? Thanks Queen for always starting off my day on a good note with your wonderful content loeya1
Yes, all of what you see above is in a single message. This example event corresponds to a user resubscribing to the streamer's channel on Twitch. Here is the same thing represented as symbols
<tags> USERNOTICE #<channel> <chatmsg>
The <tags> are Twitch's way of sending lots of structured data in an IRC message. The first character in the message is always the at sign @. The tags end at the first space in the message. Everything in between is the tags. To parse the tags into structured data you need to do the following.
- Take everything from after the at sign to just before the first space as the tag string.
- Split the tag string into an array by the semicolon ;
- For each element in the array, check to see if the element has an equals sign =
- If the element has an equals sign, split the element on the equals sign. This gives you the tag name and value.
- If the element does not have an equals sign then you just have a tag name.
At this point you have a series of name and value pairs. I recommend you put these into a hash-type data structure. The keys can be literally anything, and Twitch constantly adds more.
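The steps above can be sketched in Python as follows. Two things here go beyond the steps: the split uses only the first equals sign in case a value contains one, and values are unescaped following the IRCv3 message tags convention (which is why the example above contains sequences like \s for a space). The function names are hypothetical.

```python
def unescape_tag_value(value):
    """Undo IRCv3 tag escaping, for example backslash-s becomes a space."""
    escapes = {"s": " ", ":": ";", "\\": "\\", "r": "\r", "n": "\n"}
    out = []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            # Consume the character after the backslash, if there is one.
            nxt = next(chars, "")
            out.append(escapes.get(nxt, nxt))
        else:
            out.append(ch)
    return "".join(out)


def parse_tags(line):
    """Parse the leading @tags of a message into a dict of name to value."""
    if not line.startswith("@"):
        return {}
    # Everything after the at sign and before the first space.
    tag_string = line[1:].split(" ", 1)[0]
    tags = {}
    for element in tag_string.split(";"):
        if not element:
            continue
        if "=" in element:
            name, value = element.split("=", 1)
            tags[name] = unescape_tag_value(value)
        else:
            tags[element] = None  # a tag name with no value
    return tags
```

A plain dict works well here because unknown keys are simply carried along, which matters since new tags keep appearing.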
The USERNOTICE is always exactly that, just a string identifying this as a user notice.
The value #<channel> is one of the channels you joined with the JOIN command earlier.
The <chatmsg> is a string representing the unformatted message sent by the user.
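To pull those pieces apart, a line can be split into its tag string, sender prefix, command, target and chat text. The Python sketch below is one possible approach rather than a full IRC parser; Message and split_message are hypothetical names.

```python
from typing import NamedTuple


class Message(NamedTuple):
    raw_tags: str  # tag string without the leading "@", or ""
    prefix: str    # sender without the leading ":", or ""
    command: str   # for example USERNOTICE or PRIVMSG
    target: str    # for example "#loeya", or "" if absent
    text: str      # chat message without the leading ":", or ""


def split_message(line):
    """Split one raw chat line into its main parts."""
    rest = line.rstrip("\r\n")
    raw_tags = prefix = ""
    if rest.startswith("@"):
        raw_tags, _, rest = rest[1:].partition(" ")
    if rest.startswith(":"):
        prefix, _, rest = rest[1:].partition(" ")
    # The chat text is everything after the first " :".
    params, _, text = rest.partition(" :")
    parts = params.split()
    command = parts[0] if parts else ""
    target = parts[1] if len(parts) > 1 else ""
    return Message(raw_tags, prefix, command, target, text)
```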
User notice messages
As you can probably guess, lots of things show up as a user notice. The way to tell them apart is to check the tag named msg-id after you've parsed the tags. The following table captures the values for msg-id that show up. Each message also carries a huge number of other tags. I've listed the additional tags that I found useful.
msg-id | Description | Additional tags |
---|---|---|
sub | User subscribes | display-name,login,msg-param-sub-plan,msg-param-sub-plan-name |
resub | User resubscribes | display-name,login,msg-param-cumulative-months,msg-param-sub-plan,msg-param-sub-plan-name |
subgift | User gifts subscription | display-name,login,msg-param-months,msg-param-sub-plan,msg-param-sub-plan-name,msg-param-recipient-display-name,msg-param-recipient-user-name |
primepaidupgrade | User upgrades from free Prime Subscription | display-name,login,msg-param-months,msg-param-sub-plan,msg-param-sub-plan-name,msg-param-recipient-display-name,msg-param-recipient-user-name |
giftpaidupgrade | User gifts subscription upgrade | display-name,login,msg-param-months,msg-param-sub-plan,msg-param-sub-plan-name,msg-param-recipient-display-name,msg-param-recipient-user-name |
raid | User raids this channel | display-name,login,msg-param-months,msg-param-sub-plan,msg-param-sub-plan-name,msg-param-recipient-display-name,msg-param-recipient-user-name |
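As an illustration, a parsed tag dictionary can be turned into a short description by looking up msg-id. This Python sketch uses only the msg-id values from the table; describe_user_notice is a hypothetical name, and unknown values return None because new ones may appear over time.

```python
# Wording taken from the Description column of the table above.
ACTIONS = {
    "sub": "subscribed",
    "resub": "resubscribed",
    "subgift": "gifted a subscription",
    "primepaidupgrade": "upgraded from a free Prime subscription",
    "giftpaidupgrade": "gifted a subscription upgrade",
    "raid": "raided this channel",
}


def describe_user_notice(tags):
    """Return a one-line description of a user notice, or None if unknown."""
    msg_id = tags.get("msg-id")
    action = ACTIONS.get(msg_id)
    if action is None:
        return None
    who = tags.get("display-name") or tags.get("login") or "someone"
    text = f"{who} {action}"
    months = tags.get("msg-param-cumulative-months")
    if msg_id == "resub" and months:
        text += f" ({months} months in total)"
    return text
```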
"Private messages"
The other type of message you'll receive looks like this
@badge-info=subscriber/1;badges=subscriber/0,sub-gifter/1;bits=2000;color=;display-name=jake_the_nice_n_lord;emotes=;flags=;id=d12c3183-b226-4d58-b149-29bb77834af1;mod=0;room-id=166279350;subscriber=1;tmi-sent-ts=1558353420319;turbo=0;user-id=435647128;user-type= :jake_the_nice_n_lord!jake_the_nice_n_lord@jake_the_nice_n_lord.tmi.twitch.tv PRIVMSG #loeya :cheer2000 thanks for the entertaining streams. enjoy!
This specific message was sent when a user donated bits to the streamer. Represented as symbols, this message is formatted as follows
<tags> PRIVMSG #<channel> <chatmsg>
These symbols are the same as those in the USERNOTICE I described above. Historically, PRIVMSG did have connotations of a private message, but this is a public Twitch stream, so you can pretty much ignore the name of this message. Unlike the user notices above, identifying the message type is not so easy. The vast majority of these messages are of course just regular user chat messages and can be ignored. The two types I have identified as interesting are described below.
User bit donations
If a user donates bits, you see a message with a tag of bits set to a numerical value. You can also check the tag display-name on the same message to find out who donated. The chat message is whatever message the user entered when they donated bits.
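In Python, that check might look like the sketch below. It assumes the tags have already been parsed into a dictionary of strings; bits_donation is a hypothetical name.

```python
def bits_donation(tags):
    """Return (display_name, bits) for a bit donation, else None."""
    raw = tags.get("bits")
    # Regular chat messages have no bits tag at all.
    if not raw or not raw.isdigit():
        return None
    return tags.get("display-name", ""), int(raw)
```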
Hosts
If you are hosted by another channel, you usually get a simple message like this
:jtv!jtv@jtv.tmi.twitch.tv PRIVMSG theadventuresofcoyote :CheeseLordTheCommunist is now hosting you.
This is actually one of the shortest messages you'll receive. It has no tags to identify it whatsoever. The first part of the message is always :jtv!jtv@jtv.tmi.twitch.tv as shown above. There is one important thing to notice in the above message: it is formatted as PRIVMSG username, with no hash mark # after the PRIVMSG. This means the message is sent directly to a user, like a whisper. So while you can connect to anyone's chat and monitor bit donations, subscriptions, etc., you need to be authenticated as the streamer to actually receive these host notifications. I don't know why this distinction is made, as a Twitch host is fairly public knowledge, but that is how it works.
The message shown above ends with the text "is now hosting you". This is what is usually sent, with no indication of the number of viewers. Sometimes you also get messages like this
:jtv!jtv@jtv.tmi.twitch.tv PRIVMSG theadventuresofcoyote :Medallion95 is now hosting you for up to 3 viewers.
In this example, the text contains a numerical indication of the number of viewers. I wanted one piece of code to check for both types, so I did the following
- Check that the message is a PRIVMSG
- Check that the message starts with :jtv!jtv@jtv.tmi.twitch.tv
- Check that the message has the text "now hosting you"
- Grab everything from the start of the chat message to the first space. This is the user that hosted you
- Run a regular expression checking for "up to (\d+) viewers" or similar in whatever language you prefer
- If the regular expression matches, pull out the match group as the viewer count
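The checks above can be sketched in Python like this. The function name parse_host_notification is hypothetical, and the regular expression also accepts the singular "viewer" as a guess at how a count of one might be worded.

```python
import re

JTV_PREFIX = ":jtv!jtv@jtv.tmi.twitch.tv"
VIEWERS = re.compile(r"up to (\d+) viewers?")


def parse_host_notification(line):
    """Return (hosting_user, viewer_count or None), or None if not a host notice."""
    if not line.startswith(JTV_PREFIX + " PRIVMSG "):
        return None
    # The chat message is everything after the first " :".
    _, sep, text = line.partition(" :")
    if not sep or "now hosting you" not in text:
        return None
    user = text.split(" ", 1)[0]
    match = VIEWERS.search(text)
    viewers = int(match.group(1)) if match else None
    return user, viewers
```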
Host targets
The last type of message you'll see looks like this
:tmi.twitch.tv HOSTTARGET #theadventuresofcoyote :infantryman4life 16
The message is exactly what it looks like. If you are in a channel when that streamer hosts someone else, you will receive this message. The structure is as follows
:tmi.twitch.tv HOSTTARGET #<channel-hosting> :<channel-hosted> <viewer-count>
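A short Python sketch for pulling the three values out of such a line follows. The name parse_host_target is hypothetical, and the viewer count is returned as None if it is missing or not a number.

```python
def parse_host_target(line):
    """Return (hosting_channel, hosted_channel, viewer_count) or None."""
    parts = line.strip().split(" ")
    if len(parts) < 4 or parts[0] != ":tmi.twitch.tv" or parts[1] != "HOSTTARGET":
        return None
    hosting = parts[2].lstrip("#")
    hosted = parts[3].lstrip(":")
    count = parts[4] if len(parts) > 4 else ""
    viewers = int(count) if count.isdigit() else None
    return hosting, hosted, viewers
```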
Twitch Follows
But what about Twitch Follows? What chat messages inform you about those? Well the answer is simple: none. At the time of my investigation there was no way to receive notifications in your application that a user has received a new follower.
But of course there is a workaround for this. The Twitch API allows you to get all of a user's followers by making an HTTP request. This is fully documented here. The URL is https://api.twitch.tv/kraken/users/<user ID>/follows/channels. The value <user ID> is the numeric ID of the user, not the publicly visible name of the streamer.
To get the User ID, there are a couple of ways
- Make a request to https://api.twitch.tv/kraken/user to get a result with your own user ID
- If you want to search for users, you can make a request to https://api.twitch.tv/kraken/users?login=<user names>
For both APIs you need to have registered with the Twitch API so they know what application is making the request. The steps for that are
- Register your application with Twitch
- Get a User Access Token with scope user_read
Each HTTP request you make must include the access token to prove your identity. This involves using OAuth, so you may want to use a library to help you implement it. The easiest way to get a token is through the OAuth Implicit Code Flow. If you aren't familiar with this process, then I suggest you read this article from DigitalOcean.
With all that taken care of we can now check for new follows periodically. In order to detect new follows, I did the following in a loop.
- Request the follows, with parameters of offset=0, limit=10, sortby=created_at and direction=desc
- Find follows in the result that are not in the results from the previous iteration
- These follows are the new follows for this user
- Store all the follows as the previous result
- Sleep for 61 seconds.
- Go back to step 1
This works because the API always returns the 10 most recent follows. So the difference between the follows from the current result and the follows from the previous result is the set of new ones. This obviously isn't instant, since this is a polling operation. The polling rate of once every 61 seconds was based on something I read from a Twitch employee suggesting that polling only once a minute would not result in rate limiting. This seemed to work for me. You'll receive an HTTP response code of 429 if you get rate limited.
This does of course assume that a user does not receive more than 10 follows in 61 seconds. If that might happen, increase the limit request parameter appropriately.
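The polling loop can be sketched in Python as shown below. This is a simplified outline under several assumptions: fetch_ids is a hypothetical callable that performs the authenticated HTTP request and returns an identifier for each follow, newest first; on_follow is a hypothetical callback; and the first poll only records a baseline so existing follows are not reported as new.

```python
import time
from urllib.parse import urlencode


def follows_url(user_id, limit=10):
    """Build the follows URL with the query parameters described above."""
    query = urlencode({"offset": 0, "limit": limit,
                       "sortby": "created_at", "direction": "desc"})
    return f"https://api.twitch.tv/kraken/users/{user_id}/follows/channels?{query}"


def new_follows(previous_ids, current_ids):
    """Return the ids in current_ids that were not in previous_ids, in order."""
    seen = set(previous_ids)
    return [i for i in current_ids if i not in seen]


def poll_follows(fetch_ids, on_follow, interval=61, sleep=time.sleep, rounds=None):
    """Poll for follows and call on_follow for each new one.

    rounds limits the number of polls (None means loop forever).
    """
    previous = None
    done = 0
    while rounds is None or done < rounds:
        current = fetch_ids()
        if previous is not None:
            for follow_id in new_follows(previous, current):
                on_follow(follow_id)
        previous = current
        done += 1
        if rounds is None or done < rounds:
            sleep(interval)
```

Passing sleep and rounds as parameters is only there to make the loop easy to exercise without waiting or calling the real API.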
There is in fact a webhook API for "User Follows". It is documented here. The documentation seems to indicate that it can proactively notify you when a user receives a new follow. The problem is that you need to be running a web server to make this work. It has to be publicly accessible, because Twitch makes an HTTP POST request over the internet when the user gets a new follow. In my case the application I was building was a simple desktop application, so running a public-facing web server was not an option.