On September 5, 2012, Lil JoJo (Joseph Coleman), an 18-year-old aspiring rapper from Chicago, was gunned down moments after posting a tweet bragging about being in an opposing gang's neighbourhood. A few days later, someone claiming to be his girlfriend posted Facebook transcripts of Coleman anticipating his own death.

Newly signed Chicago artist Chief Keef posted a tweet laughing at the death of Joseph Coleman hours after the incident.

The lyrics in the video above repeat the phrases '3hunnak' and 'BDK'. '3hunnak' is a reference to Chief Keef's gang, 300, a subset of the Black Disciples (BD). The '-k' suffix denotes 'killer', so 'BDK' is read as 'killer of Black Disciples'.

Rap beef that crosses over into real-life violence isn't new, but the JoJo-Chief Keef story in particular allows observers to digitally archive violence between teenagers that would otherwise not make headlines.

A step further would be to see how the people interacting with the content behave. Browsing through the YouTube comments on '3hunnak' and filtering for comments that explicitly mention 'bdk' gives observers a view of this relatively closed-off world.

Both of these comments refer to '69' and 'bdk', which are relatively unique terms. It seems logical that the number 69 could refer to a street in the context of gang shoutouts. It seems that these two terms {bdk, 69} go together, perhaps as a street reference paired with a gang reference.

This makes sense given that most Chicago rappers gain viewership through 'grassroots methods'. In practice, this means that the people in the comments section write similarly to how the rappers they are watching talk: in this case, with street and gang references unique to the Chicago area. We know that 'BD' refers to the Black Disciples, who are based in the Chicago area.

If these users are writing about gang affiliations, these videos present an interesting opportunity to mine data, correlate gang terms across the city, and find out, geographically, where these 'sets' are.


To do this, lots of relevant data must be gathered. An easy way to gather it is to 'graph out' from song to song. For each song, YouTube suggests similar songs. So, if we know that a song's comments have a high ratio of gang shoutouts {bdk, 69...}, we can add that video's comments to a database and move on to its suggestions.
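The crawl described above could be sketched roughly as follows. This is a minimal illustration, not the code actually used: `get_related` and `get_comments` are hypothetical callables standing in for whatever fetches YouTube's suggestions and comments, and the seed terms, the 0.2 threshold and the toy data are all assumptions.

```python
from collections import deque

SEED_TERMS = {"bdk", "69"}  # illustrative seed terms taken from the text


def shoutout_ratio(comments, terms):
    """Fraction of comments containing at least one seed term (crude whitespace tokens)."""
    if not comments:
        return 0.0
    hits = sum(1 for c in comments if terms & set(c.lower().split()))
    return hits / len(comments)


def crawl(seed_id, get_related, get_comments, terms=SEED_TERMS,
          min_ratio=0.2, max_visits=50):
    """Breadth-first walk over suggested videos, keeping those at or above min_ratio."""
    database = {}              # video id -> list of stored comments
    seen = {seed_id}
    queue = deque([seed_id])
    visits = 0
    while queue and visits < max_visits:
        video_id = queue.popleft()
        visits += 1
        comments = get_comments(video_id)
        if shoutout_ratio(comments, terms) < min_ratio:
            continue           # off-topic video: skip it and do not follow its suggestions
        database[video_id] = comments
        for nxt in get_related(video_id):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return database


# Toy stand-ins for the real suggestion graph and comment sections (made up).
RELATED = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
COMMENTS = {
    "a": ["bdk all day", "nice beat"],
    "b": ["69 bdk", "bdk", "first"],
    "c": ["love this song", "great video"],
    "d": ["bdk gdn"],
}

db = crawl("a", lambda v: RELATED.get(v, []), lambda v: COMMENTS.get(v, []))
print(sorted(db))
```

In this toy run, video "c" is dropped because none of its comments contain a seed term, so the crawl never follows its suggestions. The threshold decides how far the crawl drifts away from the original scene.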

After this is done, we need to 'graph out' to find new terms. So, for example, the machine can go from comments mentioning {bdk} and find that the term correlates with {69}. Iterating from there, it can find which new terms correlate with {bdk} and {69}. So, the machine can go through a process like this to learn new terms by filtering for relevant comments:

This comment refers to both '69' and 'bdk'. It can be inferred from this comment that '69' is in fact a reference to a street. More specifically, we know that 69th and Parnell correlates with 'bdk'. Also, {GDN} and {BDK} correlate. One is a positive relationship (GD) and one a negative relationship (BD), which makes sense since {BDN} and {GDN} have a long, historical beef tracing back to the 1980s.

Data in this form is relatively simple to mine. With enough data and some supervised machine-learning algorithms (supervised because of how messy some of the affiliation writing can be), we can start to map out correlates. It won't always be in as clean a format as {represent X, oppose Y, street Z}, but with enough data and smart algorithms, gang relationships can be correlated positively and negatively, and geographic locations can be discovered.
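The term-expansion loop described earlier (start from {bdk}, find {69}, then iterate) might look something like the sketch below. It is an unsupervised stand-in, not the supervised approach mentioned above; the function name, the `min_count` cut-off and the sample comments are assumptions made for illustration.

```python
from collections import Counter


def expand_terms(comments, seeds, min_count=2, rounds=2, stopwords=frozenset()):
    """Grow a set of terms by repeatedly adding tokens that co-occur with known terms."""
    known = set(seeds)
    for _ in range(rounds):
        counts = Counter()
        for comment in comments:
            tokens = set(comment.lower().split())
            if tokens & known:
                # Count each unknown token once per relevant comment.
                counts.update(tokens - known - stopwords)
        new_terms = {t for t, n in counts.items() if n >= min_count}
        if not new_terms:
            break              # nothing new cleared the threshold
        known |= new_terms
    return known


# Made-up comments, only to show the iteration.
sample = [
    "bdk 69",
    "69 bdk gdn",
    "69 gdn",
    "gdn 7-4-14",
    "gdn 7-4-14 folks",
    "nice song",
]

print(sorted(expand_terms(sample, {"bdk"})))
```

With the default two rounds, the first pass picks up {69} from comments mentioning {bdk}, and the second picks up {gdn} from comments mentioning {69}. Each extra round reaches further out, which is also how unrelated words creep in, so a stopword list and a human check are probably needed in practice.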

Issues like whether YouTube users are even representative of anything have to be thought about seriously, as do the ethical issues surrounding the monitoring of artists and users with this kind of methodology. A smart enough algorithm might be able to determine exactly where comments were written from and build a database that way.

Back to 'bdk'. Going through the comments database using a 'sophisticated algorithm' to find explicit street references, here is a partial list of comments that mention 'bdk' alongside a street:

ebk loc shyt 109th ls bdk so jojo
bdk come at me nigga. 069 . 69th and lowe we out here
da first comment u said fuck jojo dats why bds shot em now u talm bout yall was best friends.lmaoo yo ass goofy u aint makin no noise bitch i can get ya whole fam wacked on bdk bitch come on 75th its crackin
057th normal row row boys we wit it!!!!!!!!!! bdk
bdk wiiicity u hear 87th east side crazy lil reese a hoe on me i went to school on
dis shyt go hard 111th vernon rockin wit it gdn bdk upt crazy
bdk wild 100's..115th perry..gdn bitch
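The 'sophisticated algorithm' is not spelled out, but a first pass at pulling numbered streets out of comments like these could be a regular expression. The sketch below is an assumption about how that might be done: it only recognises ordinals such as '69th', strips a leading zero ('057th'), and captures a cross street only when 'and' or '&' is present, so '111th vernon' loses its cross street. The sample comments are taken from the list above, two of them trimmed.

```python
import re
from collections import Counter

# Ordinal street number, optional leading zero, optional "and <name>" cross street.
STREET_RE = re.compile(
    r"\b0?(\d{2,3})(?:st|nd|rd|th)\b(?:\s+(?:and|&)\s+([a-z]+))?"
)


def extract_streets(comment):
    """Return (street number, cross street or None) pairs found in a comment."""
    found = []
    for match in STREET_RE.finditer(comment.lower()):
        found.append((int(match.group(1)), match.group(2)))
    return found


def streets_for_term(comments, term="bdk"):
    """Count street references in comments that mention the given term."""
    counts = Counter()
    term_re = re.compile(rf"\b{re.escape(term)}\b")
    for comment in comments:
        if term_re.search(comment.lower()):
            counts.update(extract_streets(comment))
    return counts


SAMPLE = [
    "ebk loc shyt 109th ls bdk so jojo",
    "bdk come at me. 069 . 69th and lowe we out here",
    "057th normal row row boys we wit it!!!!!!!!!! bdk",
    "bdk wild 100's..115th perry..gdn",
]

for street, n in streets_for_term(SAMPLE).items():
    print(street, n)
```

Bare numbers like '069' and block ranges like "100's" are deliberately ignored here, since a number on its own could just as easily be something other than a street.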

And so on. Other terms can be gathered and other streets can be found. From here, we start to enter weirder methodologies. For example, sometimes geographic data is given without an explicit gang shoutout:

48th indiana

This comment by itself says nothing about '48th and Indiana'. However, because the database has graphed out far enough into the Chicago YouTube rap ecosystem, we can search through comments by the same user and find this:


That user's other comment points to {48th & Indiana} being a {GD} block. It's a technique for gathering data, although perhaps it comes at the expense of profiling users. The argument can be made that writing things on the internet automatically leaves the user susceptible to data mining. Perhaps it is an unfair contract, though.
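The same-user lookup could be sketched like this, assuming the database stores each comment as a (user, text) pair. The user names, the gang-term set and the sample comments are invented for illustration, and only the numbered street is kept.

```python
import re
from collections import Counter, defaultdict

GANG_TERMS = {"gd", "gdn", "bd", "bdk"}          # illustrative, not exhaustive
STREET_RE = re.compile(r"\b\d{2,3}(?:st|nd|rd|th)\b")


def infer_street_affiliations(comments):
    """Map each street to the gang terms used anywhere by the users who mentioned it."""
    terms_by_user = defaultdict(Counter)
    streets_by_user = defaultdict(set)
    for user, text in comments:
        lowered = text.lower()
        tokens = re.findall(r"[a-z0-9]+", lowered)
        terms_by_user[user].update(t for t in tokens if t in GANG_TERMS)
        streets_by_user[user].update(STREET_RE.findall(lowered))

    affiliations = defaultdict(Counter)
    for user, streets in streets_by_user.items():
        for street in streets:
            # Attach every gang term this user ever wrote to the street.
            affiliations[street].update(terms_by_user[user])
    return dict(affiliations)


# Made-up rows: one user names a street in one comment and a gang term in another.
rows = [
    ("user_a", "48th indiana"),
    ("user_a", "gdn all day"),
    ("user_b", "nice beat"),
]

result = infer_street_affiliations(rows)
print(result)
```

Joining comments by user is exactly the step that turns scattered remarks into a profile, which is where the ethical concern raised above becomes concrete.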

What these videos do is allow the kids posting to have a dialogue (an unproductive one if the context for productive dialogue here is less overall animosity) with other users in a relatively safe medium (the internet). Users like the one above can actively find a 'community' of like-minded thinkers on the video comments section.

Given that most of these songs are more heavily 'liked' than 'disliked', we can assume that the majority of each video's commenter base agrees with the artists in terms of gang association. By that logic, it can be argued that these videos are a way for these kids to find songs that speak to the things they think about.

In a way, it might be new territory. The JoJo story is a narrative detractors can point to as evidence that these 'virtual beefs' (or more generally, 'sonic beefs') can lead to real-life violence. The implication is that the 'rise of drill music' can lead to more violence in Chicago. In a way, they may be right. But probably not. A link between a rise in homicides and music would be tough to make and almost impossible to prove. As a counterpoint, Chicago's murder total over the past 5 years or so is on average about half of what it was in the early '90s. But none of that is really interesting.


The next technique is to search through comments that have 'bdk' and find other relevant terms that appear with it. Here is a partial list with 'scores':

bdk: 1454
bricksquad: 41
insane: 32
fbg: 26
tookaville: 17
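Scores like these could come from a simple co-occurrence count: for every comment containing 'bdk', count each distinct token once. The sketch below assumes that is what a 'score' means here, which the text does not confirm, and uses made-up comments and a one-word stopword list.

```python
import re
from collections import Counter


def cooccurrence_scores(comments, term="bdk", stopwords=frozenset()):
    """Count, per token, how many comments containing `term` also contain it."""
    scores = Counter()
    for comment in comments:
        tokens = re.findall(r"[a-z0-9\-]+", comment.lower())
        if term in tokens:
            # A set, so a word repeated within one comment is counted once.
            scores.update(set(tokens) - stopwords)
    return scores


# Made-up comments for illustration only.
comments = [
    "bdk bricksquad",
    "bdk insane bdk",
    "fbg only",
    "BDK bricksquad the",
]

scores = cooccurrence_scores(comments, stopwords=frozenset({"the"}))
for token, score in scores.most_common(3):
    print(f"{token}: {score}")
```

The search term itself always comes out on top, as 'bdk' does in the list above, because it appears in every comment that is counted.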

Then we can map out, geographically, which areas {Bricksquad, insane, fbg, tookaville} relate to and which other terms they co-occur with.

69th and eggleston
51st st, Englewood
57th st
70th st
68th st
69th st and California
69th and King dr
75th st
79th st and Drexel
63rd st
62nd st


Smart math people have developed some good algorithms to separate and classify terms. We can use these to figure out the sub-gangs and sub-terms associated with the BD's and

brick city
wic city
joc city
mac creek
dro city
Welch world
Jaro city
young money
roc creek
low end
killa ward

The problem with gathering good data here is that the comments cannot always be parsed cleanly. Take, for example, this comment:

nigga fuck 39 and welchworld from 40th langley da projects and 43 cottage but dat nigga choppa a bitch he aint welchworld and its crackin for 39 lakepark

It becomes a little more complicated than collecting terms and outputting associations, especially geographic ones. Is the user saying 'fuck 39 AND fuck welch world' or 'fuck 39' + 'welch world ranges from 40th and Langley to 43rd and Cottage'? Not to mention that the phrase 'it's cracking' is itself mysterious. It might not be as simple an interpretation as 'fuck 39th and Lakepark', although that is what I am assuming is being said.

And the deeper this world is explored, the more coded the terminology gets.

7-4-14 we out chea folkz!!!!!!!!#bdk

Here, 7-4-14 seems like a mysterious term that is unlikely to be a reference to a street. Luckily though, with enough data, things sometimes explain themselves. A simple search for 'stands for' and '7-4-14' returns this helpful comment from the database:

7-4-14 stands for gdn...g is the 7th letter in the alphabet, d is the 4th and n is the 14th
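Once the scheme is known, codes of this kind can be decoded mechanically. The helpers below are a small sketch of the alphabet-position mapping that the comment describes; the function names are made up.

```python
import string


def decode_letter_code(code, sep="-"):
    """Turn a code like '7-4-14' into letters using alphabet positions (a=1 ... z=26)."""
    letters = []
    for part in code.split(sep):
        position = int(part)
        if not 1 <= position <= 26:
            raise ValueError(f"{position} is not a valid alphabet position")
        letters.append(string.ascii_lowercase[position - 1])
    return "".join(letters)


def encode_letters(word, sep="-"):
    """Inverse mapping: 'gdn' becomes '7-4-14'."""
    return sep.join(str(string.ascii_lowercase.index(ch) + 1) for ch in word.lower())


print(decode_letter_code("7-4-14"))   # gdn
print(encode_letters("gdn"))          # 7-4-14
```

Running `encode_letters` over the known gang terms gives a list of numeric codes to search the comments for, rather than waiting for a commenter to explain each one.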

And so on. The lack of trafficking references in these songs is odd, though. The gang stuff in this context can be interpreted as a community phenomenon and not necessarily a calculated fiscal decision.

The goal will be to get this machine to be self-sustaining: able to venture out, find new videos with relevant comments, and make the relationships and the geographic locations tighter and better.