Analyzing your iMessage group chat with OpenAI’s API

My college friends and I have an iMessage group chat that has been active for 10 years or so. I’ve seen how good GPT is at analyzing text and wanted to apply it to our conversations. It was easier than I thought it would be, and you can follow my process at the end of this article.

5 years ago I analyzed my facebook messages, plotting the frequency of my exchanges with different friends over time. It is remarkable to see how much more powerful it is to analyze similar data using OpenAI’s APIs and how much easier it is. You can feed in messages and ask questions like “how would you describe this person’s personality?” or “what are some of the funniest things we talked about?” and get reasonably good answers back.

I fed in a recent block of about 200 messages, a sample of which is below.

Then I prompted the API with lots of questions analyzing the conversation. For example, “What are the top 5 topics of this group chat?” produced the following output. Pretty spot on!

It’s fun to get an “outsider’s” perspective on you and your friends and the way you relate to each other. A lot of the analysis is too general to be compelling, like a horoscope. But sometimes it hits on things that you know are true. Here are a few prompts about how I come across.

How would you describe Joe?

Joe is a witty and sarcastic individual with a dry sense of humor. He often adds his unique perspective to conversations and enjoys making clever and sometimes cheeky remarks. While he may come across as a bit sarcastic or aloof at times, he is generally liked and known for his quick wit and laid-back demeanor.

Perhaps accurate, but it’s too nice, kinda boring. You have to ask to get it to be less positive.

Be critical. How would you describe Joe?

Joe is a somewhat elusive individual. He has a tendency to be vague and detached, often offering minimal information or lukewarm responses. It's difficult to get a sense of his true personality or interests based on his contributions. While he may have some knowledge and opinions on certain topics, he often appears disinterested or disengaged. Overall, Joe comes across as someone who doesn't fully invest himself in conversations or provide meaningful contributions.

Elusive! I’ll take it.

How you can analyze your own iMessage group chat with OpenAI’s API

The process was easy and a fun way to learn about the APIs OpenAI offers. OpenAI does charge for this, so you have to give them a credit card, but playing with it for a while only cost a few cents. At a high level, I:

Exported my iMessages on my Mac to text files
Read those text files into a Pandas DataFrame in a Jupyter notebook
Signed up for an OpenAI account to get an API key
Fed the text messages from my friends’ and my groupchat into the ChatCompletion API and asked it questions

The only part that wasn’t straightforward was getting my iMessages exported into a format that was easy to analyze, but there was a tool that made that easy too. I used the tool outlined here which has more detailed instructions, but you can also open a terminal and follow the commands I used to use this tool below (You need to be on a Mac for this to work):

# install rust
curl https://sh.rustup.rs-sSf | sh
# install imessage exporter tool
cargo install imessage-exporter
# export messages to txt files
imessage-exporter -f txt

Once the iMessages are exported, you can open a Jupyter notebook and read the text files into a Pandas DataFrame:

import os
import re
import pandas as pd
import openai
import time

# Define the path to the folder containing the text files
folder_path = "/PATH TO YOUR FOLDER"

# Create an empty list to store the contents of each text file
file_contents = []
file_names = []

# Loop through each file in the folder
for filename in os.listdir(folder_path):
    if filename.endswith(".txt"):
        # Open the file and read its contents
        with open(os.path.join(folder_path, filename), "r") as f:
            contents = f.read()
            # Append the contents to the list
            file_contents.append(contents)
            file_names.append(filename)

# Create a pandas DataFrame with each text file in its own row
df = pd.DataFrame({"Text": file_contents,
                  "Numbers": file_names})

# Print the DataFrame
print(df.head())

Now, each row of the DataFrame is an entire history of messages between you and one or more people. So, if you want to analyze a single conversation like I did, you need to find it. You can either search for a specific phone number to find the row in the DataFrame, using something like the following:

df.loc[df['Text'].str.contains('1234567891')]

Once you have found the row you wish to analyze, see what number in the index it is, and keep only that row:

friendgroup_chat = df.loc[[1643]]

Then there is a bunch of boring formatting to get it into a useable state:

# This splits the long conversation into individual texts, based on the date which for some reason the imessage tool exported improperly
friendgroup_chat['Text']=friendgroup_chat['Text'].apply(lambda x: x.split('\nDec 31, 2000  7:00:00 PM\n'))

# Explode it so that each row is an individual text
fg_chat=friendgroup_chat.explode('Text').drop(columns=['length', 'Numbers'])

fg_chat.reset_index(inplace=True)

fg_chat.drop(columns='index',inplace=True)

fg_chat['Text'].str.split('\n', n=1,expand=True)

fg_chat[['sender', 'text']]=fg_chat['Text'].str.split('\n', n=1,expand=True)

# name the people in your group chat 
name_dict = {'Me': 'Joe',
            '+1XXXXXX': 'Friend 1',
            '+1XXXXXX': 'Friend 2',
            '+1XXXXXX': 'Friend 3',
            '+1XXXXXX': 'Friend 4',
            '+1XXXXXX': 'Friend 5',
            }

fg_chat['sender_name']=fg_chat['sender'].apply(lambda x: name_dict[x])

fg_chat.drop(columns=['Text', 'sender'],inplace=True)

fg_chat['text']=fg_chat['text'].replace('\n',' ',regex=True)

# renaming things so that they fit well with the Open AI's API formatting
fg_chat.rename(columns={'text':'content', 'sender_name':'name'},inplace=True)

fg_chat['role']='user'

# take a sample of messages to analyze, you can change this to analyze different time periods of your chat
sample_messages = fg_chat.head.tail(220)

# i was lazy and just removed all the messages containing files, you could do this in a smarter way
sample_messages = sample_messages.loc[~sample_messages['content'].str.contains('jpeg'),]
sample_messages = sample_messages.loc[~sample_messages['content'].str.contains('PNG'),]
sample_messages = sample_messages.loc[~sample_messages['content'].str.contains('png'),]
sample_messages = sample_messages.loc[~sample_messages['content'].str.contains('JPG'),]
sample_messages = sample_messages.loc[~sample_messages['content'].str.contains('jpg'),]
sample_messages = sample_messages.loc[~sample_messages['content'].str.startswith('Loved'),]
sample_messages = sample_messages.loc[~sample_messages['content'].str.startswith('Laughed at'),]

Finally, you can call the API and ask it whatever you want!

openai.api_key ='YOUR API KEY'
openai.organization = "YOUR ORGANIZATION"

# read the data frame into a dictionary because that is what the API accepts
message_dict=sample_messages.to_dict(orient='records')
message_dict.append({'content':'Be critical. How would you describe Joe?',
'role':'system'}) # you can give instructions to the AI using the "system" role 

# call the ChatCompletion API, passing in the dictionary of your messages
completion = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=message_dict
)

# show the response!
print(completion.choices[0].message['content'])

Let me know if you have any questions, email me at joe @ this website or message me on twitter. Would love to hear how you end up using this! Also, if you figure out how to feed all of your messages into the API at one time, I’d love to hear about it; I only could do a relatively small sample at a given time.