r/boston's favorite dishes, 2023

Kenny Chou · July 30, 2023

Users of Reddit r/Boston responded to the question “What is the single best restaurant menu item in Boston?” In this post, I compiled their responses and mapped out where you can find those dishes around town. Only top-level responses are considered in the table below. The number of upvotes from repeated responses are added together into a single entry. Responses like “water” are excluded from the dataset. Responses with less than 8 upvotes are also excluded.

Hover over a circle to see the dish, restaurant, and number of votes!

Methodology

In liu of using reddit’s API, I copied the entire HTML from this reddit thread and parsed it with BeautifulSoup. After some manual cleaning, I used plotly and folium to visualize the most-upvoted dishes from the thread. Irrelevent responses (e.g., water) were removed from the dataset. At the end of this process, a dataframe df was created with restaurant, dishes, and votes.

Geographical coordinates for each restaurant were the obtained via the Yelp API. The dataframe was additionally populated with columns yelp name, longitude, and latitude.

Map upvotes to color hex

# use np.log() to transform the vote distributions to be slightly more uniform
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors

norm = mcolors.Normalize(vmin=np.log(10), vmax=np.log(500), clip=True)
mapper = plt.cm.ScalarMappable(norm=norm, cmap=plt.cm.plasma)
df['color_hex'] = np.log(df['votes']).apply(lambda x: mcolors.to_hex(mapper.to_rgba(x)))

Creating & formatting the data table

df_to_style = df[['yelp name','dishes','city','votes']].head(60).reset_index(drop=True)
df_to_style = df_to_style.rename(columns={
    'yelp name':'Restaurant',
    'dishes':'Dish',
    'city':'City',
    'votes':'# Upvotes',
})

# Define a function to apply cell styling based on hex_code column
def apply_style(row):
    hex_code = np.log(row).apply(lambda x: mcolors.to_hex(mapper.to_rgba(x)))
    if row.iloc[0] > 100:
        return [f'background-color:{hex_code.iloc[0]}; color: #000000']
    else:
        return [f'background-color:{hex_code.iloc[0]}; color: #f1efff']        

# Apply styling to the DataFrame
styled_df = df_to_style.style.apply(apply_style, axis=1, subset=['# Upvotes']).hide(axis="index")
styled_df.to_html("reddit_fav_dishes_Boston_2023_table.html")

Create map marker icons

Folium’s defult marker options are circle markers or google-map like icons. Which has a default blue color and isn’t customizable.

I want to fill each marker with a color that matches the histogram above. To do this, I first build a customized marker in Adobe Illustrator (you can copy one from the internet, and trace the image to vectorize it). Exporting the SVG and opening it in notepad reveals a browser-interpretable SVG code that can be used in conjunction with HTML code.

<!--Copy and paste this code into a .html file and see what you get-->
<?xml version="1.0" encoding="UTF-8"?>
<svg id="Layer_2" data-name="Layer 2" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 60.78 96.23">
  <defs>
    <style>
      .cls-1 {
        fill: #a5a5a5;
      }
    </style>
  </defs>
  <g id="Layer_1-2" data-name="Layer 1">
    <path class="cls-1" d="m30.34,96.23c-.74-3.99-1.29-8.03-2.24-11.97-2.97-12.33-8.72-22.98-18.29-31.6C-2.73,41.35-3.26,21.96,8.27,9.62c11.5-12.31,30.67-12.88,42.94-1.27,12.04,11.39,13.06,31.28,1.12,42.74-13.04,12.52-19.17,27.85-21.99,45.15Zm.13-88.07c-12.3-.05-22.26,9.76-22.35,22-.09,12.37,9.99,22.53,22.3,22.5,12.19-.03,22.16-10.03,22.17-22.25.02-12.22-9.92-22.21-22.12-22.26Z"/>
  </g>
</svg>

Notice it has a default style with fill color fill: #a5a5a5, and the graphic itelf has the class .cls-1. I’ll modify these properties to assign different fill colors according to each data point’s color_hex value, as defined earlier.

Unfortunately, I wasn’t able to dynamically inject fill color properties when calling folium.Marker(), so I need to first create a CSS class for each marker color with different fill colors, then assign those classes to each corresponding data point using folium.DivIcon(class_name=my_class_name).

# for each color, create an associated css class
class_arr = "."+df['color_hex'].str.replace("#","c-").unique()+" {fill: " + df['color_hex'].unique() + ";}"
marker_class_style = " ".join(class_arr)

# class_arr
# >>> '.c-fdb52e {fill: #fdb52e;}', '.c-ca457a {fill: #ca457a;}' ... etc

The modified svg code looks like this:

# svg code
div_icon =f"""
    <svg id="Layer_2" data-name="Layer 2" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">
    <defs>
        <style>
        {marker_class_style}
        </style>
    </defs>
    <g id="Layer_1-2" data-name="Layer 1">
        <path d="m30.34,96.23c-.74-3.99-1.29-8.03-2.24-11.97-2.97-12.33-8.72-22.98-18.29-31.6C-2.73,41.35-3.26,21.96,8.27,9.62c11.5-12.31,30.67-12.88,42.94-1.27,12.04,11.39,13.06,31.28,1.12,42.74-13.04,12.52-19.17,27.85-21.99,45.15Zm.13-88.07c-12.3-.05-22.26,9.76-22.35,22-.09,12.37,9.99,22.53,22.3,22.5,12.19-.03,22.16-10.03,22.17-22.25.02-12.22-9.92-22.21-22.12-22.26Z"/>
    </g>
    </svg>
    """

Display data on a map

import folium
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="example app")
# find Boston on the map
boston = geolocator.geocode("Boston")
boston_loc = (boston.latitude, boston.longitude)

# Create a map object and center it to the avarage coordinates to m
m = folium.Map(
    location=boston_loc, 
    zoom_start=13,
    tiles='cartodbpositron')

# add each business object to the map
for i,r in df.iloc[:50].iterrows():
    if (r['latitude'] != 0) & ~pd.isna(r['latitude']):
        location = (r["latitude"], r["longitude"])
        icon_class_name = r['color_hex'].replace("#","c-")
        folium.Marker(
            location=location,
            tooltip=f"{r['yelp name'].title()}:<br>{r['dishes'].capitalize()},\n{r['votes']} votes",
            icon=folium.DivIcon(
                html=div_icon,
                icon_size=30,
                class_name=icon_class_name,
                ),
            popup=f"<div style=\"width: 150px;\">{r['yelp name'].title()}:<br>{r['dishes'].capitalize()}<br>{r['votes']} votes</div>"
            ).add_to(m)

# to visualize the map, call `m` in Jupyuter Notebook

# export map as html
m.save("best_dishes_boston_2023_map.html")

Data Source

r/Boston

Further Reading