Use ChatGPT with Caution!

Jul 10, 2023 | Artificial Intelligence

StateBook uncovers significant risk in relying on ChatGPT for research or analysis, particularly for professionals making investment decisions! 

When ChatGPT began gaining notoriety for its powerful generative AI capabilities, we got excited about the prospect of applying the novel technology to create reports for our customers. What we found was not just disappointing but actually alarming.

Most users of ChatGPT and other large language models do not have easy access to the comprehensive data that StateBook has aggregated for its subscribers. Excited to test ChatGPT’s capabilities, we put it through a series of tests. It failed miserably.

You can see the results of StateBook’s analysis here. Below are some other anecdotal examples of what we found. 

StateBook analyzed the below excerpt of a report that was published by a PropTech company on its website to demonstrate how ChatGPT could add value to a property report in seconds.

StateBook shared its analysis with the PropTech company, which removed the report from its website, acknowledging that ChatGPT should not be used for research, particularly when potential real estate investment decisions worth tens or hundreds of millions of dollars hang in the balance!

Note: The blue text below shows ChatGPT’s responses to the questions posed by the PropTech company on its website. StateBook added the red text and data to show the correct figure for each item, so you can compare the correct information to what ChatGPT provided.

Comprehensive Executive Summary:

Question: Tell me about the area around 90-02 Queens Boulevard, Queens, New York 11373, United States

[Note: As a large language model, ChatGPT is able to deliver a compelling and articulate summary of the location requested. Not being familiar with the area, we do not know how accurate the below summary is.] 

90-02 Queens Boulevard is located in the Elmhurst neighborhood of Queens, New York. Elmhurst is a densely populated and ethnically diverse area with a mix of residential, commercial, and industrial properties.

The immediate area around 90-02 Queens Boulevard is primarily commercial, with a mix of retail stores, restaurants, and offices. The building itself is a six-story commercial building, and it is surrounded by other buildings of similar size and style.

One of the notable landmarks in the area is the Queens Center Mall, which is located just a few blocks away from 90-02 Queens Boulevard. The mall is one of the largest shopping centers in New York City, with over 150 stores and restaurants.

Elmhurst is also home to a number of parks and recreational areas, including Elmhurst Park, which is located just a few blocks south of 90-02 Queens Boulevard. The park features a playground, basketball court, and walking paths.

The neighborhood is well-connected by public transportation, with several bus and subway lines passing through the area. The Grand Avenue-Newtown subway station, which serves the M and R trains, is located just a few blocks east of 90-02 Queens Boulevard.

Overall, Elmhurst is a bustling and diverse neighborhood with a lot to offer, from shopping and dining to outdoor recreation and easy access to transportation.

Demographics:

Question: What are the demographics for Zip code 11373?

[Note: Every data point ChatGPT provided below was incorrect, in many cases egregiously so! Our comparison uses ACS 5-Year S0501. Decennial census and ACS 1-Year data are not published.]

Zip code 11373 is located in Queens, New York and includes parts of the neighborhoods of Elmhurst, Jackson Heights, and Corona. According to the United States Census Bureau, as of the 2020 Census, the population of zip code 11373 was approximately 100,430 (102,104) people.

The demographics for zip code 11373 are diverse, with a mix of ethnic and racial groups represented. According to the 2020 Census, the racial and ethnic composition of the zip code’s population was:

  • Hispanic or Latino: 68.8% (41.8%)
  • Asian: 22.7% (49.7%)
  • White: 5.8% (15.7%, or white alone, no Hispanic 5.4%)
  • Black or African American: 1.3% (1.8%)
  • Two or more races: 1.3% (3.8%)
  • Other races: 0.1% (28.3%)

The median household income in zip code 11373 was $59,228 ($56,835), according to the 2020 Census. The poverty rate was 14.4% (8.8%), which is slightly higher than the national average. The educational attainment level in the zip code is relatively high, with 36.5% (27.8%) of residents holding a bachelor’s degree or higher.

Overall, zip code 11373 is a diverse area with a significant Hispanic and Asian population and a mix of income levels and educational backgrounds.
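The size of each gap between ChatGPT’s percentages and the ACS 5-Year values in parentheses above can be tabulated with a short script. This is a minimal sketch in Python: the figures are copied from the response above, the names `SHARES`, `point_gaps`, and `largest_gap` are illustrative assumptions rather than part of StateBook’s analysis, and the White row uses the 15.7% figure rather than the 5.4% white-alone, non-Hispanic one.

```python
# Percentage shares for zip code 11373, as (ChatGPT figure, ACS 5-Year figure).
# Values are copied from the comparison above; all names are illustrative only.
SHARES = {
    "Hispanic or Latino": (68.8, 41.8),
    "Asian": (22.7, 49.7),
    "White": (5.8, 15.7),
    "Black or African American": (1.3, 1.8),
    "Two or more races": (1.3, 3.8),
    "Other races": (0.1, 28.3),
    "Poverty rate": (14.4, 8.8),
    "Bachelor's degree or higher": (36.5, 27.8),
}


def point_gaps(shares):
    """Return ChatGPT's figure minus the correct figure, in percentage points."""
    return {
        label: round(chatgpt - correct, 1)
        for label, (chatgpt, correct) in shares.items()
    }


def largest_gap(shares):
    """Return the label whose gap is largest in absolute value."""
    gaps = point_gaps(shares)
    return max(gaps, key=lambda label: abs(gaps[label]))


if __name__ == "__main__":
    for label, gap in point_gaps(SHARES).items():
        print(f"{label}: {gap:+.1f} percentage points")
    print("Largest gap:", largest_gap(SHARES))
```

A positive gap means ChatGPT overstated the share and a negative gap means it understated it. Population and median household income are left out because they are not percentages.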

Industry Makeup

Question: What is the industry makeup for zip code 11373 with employee count and percentage employees?

Zip code 11373 in Queens, New York is home to a diverse range of industries and businesses. According to the United States Census Bureau’s County Business Patterns data for 2020, here is the breakdown of the industries in zip code 11373 by employee count and percentage of employees: (Employment data is actually not published for zip codes, so NONE of the below information is correct.)

  1. Health Care and Social Assistance: 5,399 employees (27.2%)
  2. Retail Trade: 3,975 employees (20.0%)
  3. Accommodation and Food Services: 3,372 employees (17.0%)
  4. Other Services (except Public Administration): 1,890 employees (9.5%)
  5. Professional, Scientific, and Technical Services: 1,677 employees (8.4%)
  6. Administrative and Support and Waste Management and Remediation Services: 1,626 employees (8.2%)
  7. Wholesale Trade: 1,026 employees (5.2%)
  8. Transportation and Warehousing: 849 employees (4.3%)
  9. Finance and Insurance: 331 employees (1.7%)
  10. Manufacturing: 155 employees (0.8%)

It’s important to note that these numbers are based on employer-provided data, and do not include self-employed individuals or those who work for businesses outside of the zip code. Additionally, the numbers may have been affected by the COVID-19 pandemic, which has had a significant impact on many industries in recent years. However, the data provides a general overview of the industry makeup for zip code 11373.
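Even without access to the underlying source, a table like the one above can be checked for internal consistency. The sketch below, written in Python with names chosen purely for illustration, sums the percentages ChatGPT listed and recomputes each share from the listed employee counts. It only tests whether the list agrees with itself; it says nothing about what the real figures are.

```python
# (industry, employees, stated percent) exactly as listed in ChatGPT's response.
# ROWS and the function names are illustrative assumptions.
ROWS = [
    ("Health Care and Social Assistance", 5399, 27.2),
    ("Retail Trade", 3975, 20.0),
    ("Accommodation and Food Services", 3372, 17.0),
    ("Other Services (except Public Administration)", 1890, 9.5),
    ("Professional, Scientific, and Technical Services", 1677, 8.4),
    ("Administrative and Support and Waste Management and Remediation Services", 1626, 8.2),
    ("Wholesale Trade", 1026, 5.2),
    ("Transportation and Warehousing", 849, 4.3),
    ("Finance and Insurance", 331, 1.7),
    ("Manufacturing", 155, 0.8),
]


def stated_percent_total(rows):
    """Sum the percentages as stated in the response."""
    return round(sum(pct for _, _, pct in rows), 1)


def listed_employee_total(rows):
    """Sum the employee counts as stated in the response."""
    return sum(count for _, count, _ in rows)


def recomputed_percents(rows):
    """Recompute each industry's share of the listed employee total."""
    total = listed_employee_total(rows)
    return {name: round(100 * count / total, 1) for name, count, _ in rows}


if __name__ == "__main__":
    print("Stated percentages sum to:", stated_percent_total(ROWS))
    print("Listed employees sum to:", listed_employee_total(ROWS))
    for name, pct in recomputed_percents(ROWS).items():
        print(f"{name}: {pct}%")
```

The stated percentages add up to 102.3%. Shares of a single total cannot exceed 100%, so the list is not consistent with itself, whatever the true figures are.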

StateBook’s Chief Economist shared with StateBook’s CEO the following “correspondence” he had with ChatGPT. (Note: All ChatGPT responses are in blue.)

I thought you might find this chat I recently had with ChatGPT interesting. Note how wrong ChatGPT is over and over and over again. Get used to the term “I apologize, you are correct.” I didn’t include the entire painful conversation. My questions are in bold and ChatGPT’s responses are below.

What was the population of the Charlotte MSA in 2012 and 2013?

ChatGPT:

The population of the Charlotte Metropolitan Statistical Area (MSA) in 2012 was approximately 2,296,569, and in 2013 it was approximately 2,344,214. It’s important to note that population estimates can vary depending on the source and methodology used to calculate them.

What is the source of these estimates?

ChatGPT:

The population estimates for the Charlotte MSA in 2012 and 2013 were sourced from the United States Census Bureau’s American Community Survey (ACS). The ACS is an ongoing survey that collects data on a variety of social, economic, and housing characteristics of the U.S. population, including population estimates for metropolitan and micropolitan statistical areas. These estimates are based on a combination of survey data and administrative records and are updated annually to reflect changes in population.

Did you use the 1-year ACS or the 5-year?

ChatGPT:

I apologize for the confusion in my previous response. The 1-year American Community Survey (ACS) estimates are not available for Metropolitan Statistical Areas (MSAs) as they are typically based on a larger sample size and cover a longer period. The 1-year estimates are only available for areas with a population of 65,000 or more.

Therefore, to answer your question accurately, I need to clarify that the population estimates for the Charlotte MSA in 2012 and 2013 that I mentioned in my previous response were based on the 5-year American Community Survey (ACS) estimates, which provide more reliable and accurate population estimates for MSAs. The 5-year ACS estimates for the Charlotte MSA show that the population was approximately 2,296,569 in 2012 and approximately 2,344,214 in 2013.

Tell me the Charlotte MSA population using the 1-year ACS estimates.

ChatGPT:

I apologize for the confusion earlier, but I can provide you with the latest available 1-year ACS population estimate for the Charlotte MSA. According to the latest 1-year ACS data (2020), the population of the Charlotte-Concord-Gastonia, NC-SC Metropolitan Statistical Area was estimated to be approximately 2,701,796. It’s important to note that the 1-year ACS estimates are subject to a larger margin of error compared to the 5-year estimates due to the smaller sample size.

Do the same for 2012 and 2013.

ChatGPT:

I apologize for any confusion, but I need to clarify that the 1-year American Community Survey (ACS) estimates were not available for Metropolitan Statistical Areas (MSAs) for 2012 and 2013 as well.

The 1-year ACS estimates are only available for areas with a population of 65,000 or more, and MSAs typically have populations larger than that. Therefore, the 5-year ACS estimates are used for MSAs, which provide more reliable and accurate population estimates for these areas.

To answer your question accurately, the latest available 1-year ACS population estimate for the Charlotte MSA is for 2020, as mentioned in my previous response.

Your answer is incorrect. ACS 1-Year is released for all geographies meeting the population threshold of 65000 or more. Clearly Charlotte MSA meets that requirement. It is not based on Geography type.

ChatGPT:

You are correct, I apologize for the confusion in my previous responses. The 1-year American Community Survey (ACS) estimates are indeed available for all geographies meeting the population threshold of 65,000 or more, regardless of the type of geography.

Therefore, to answer your original question, the population of the Charlotte-Concord-Gastonia, NC-SC Metropolitan Statistical Area (MSA) in 2012 and 2013, based on the 1-year ACS estimates, were:

2012: approximately 2,296,569

2013: approximately 2,338,290

Again, I apologize for any confusion caused by my earlier responses.

Your answer is incorrect. According to data.census.gov the 2012 1-year ACS population for Charlotte MSA is 1,831,084.

Further, in 2012 it was NOT called Charlotte-Concord-Gastonia.

Given these are not official ACS statistics, what is the true source for these numbers?

ChatGPT:

I apologize for the confusion in my previous response. You are correct that the 1-year American Community Survey (ACS) estimates for the Charlotte MSA in 2012 were lower than my previous response indicated. The population estimate for the Charlotte-Concord-Gastonia, NC-SC MSA in the 1-year ACS of 2012 was approximately 1,831,084.

I obtained the earlier figures from the Census Bureau’s website, but it appears that those estimates were incorrect or outdated. I should have cross-checked the figures with a more reliable and up-to-date source.

I apologize for any confusion caused by my earlier responses. Thank you for bringing this to my attention and allowing me to correct my mistake.

Why would you not use the proper source? Why did you take the easy way out and give me incorrect answers?

ChatGPT:

As an AI language model, my responses are generated based on patterns and relationships learned from vast amounts of text data available online. While I strive to provide the most accurate and helpful responses possible, I am not always able to identify and filter out incorrect or outdated information in the same way a human expert would.

In this case, I should have checked the reliability and accuracy of the figures I provided with a more reliable and up-to-date source before answering your question. I apologize for my oversight and any confusion caused by my earlier responses.

I appreciate your feedback and will strive to improve my performance in the future.

Is it fair to compare 2012 and 2013 figures for Charlotte MSA? Is the delineation the same or did it change substantially making a comparison unfair?

ChatGPT:

It’s important to consider changes in the delineation of the Charlotte MSA when comparing population figures from different years. Metropolitan Statistical Areas (MSAs) can change in terms of geographic boundaries and the inclusion or exclusion of certain counties, which can affect the population estimates.

In 2013, the Office of Management and Budget (OMB) updated the delineation of MSAs, which may have impacted the Charlotte MSA boundaries compared to the previous year. Without a detailed analysis of the specific changes in the Charlotte MSA’s boundaries between 2012 and 2013, it’s difficult to say whether it is fair to compare the population figures for these years.

It’s worth noting that the Census Bureau typically provides population estimates for MSAs based on both the current and previous delineations to enable comparisons across time. When making comparisons across years, it’s important to carefully consider any changes in the MSA boundaries and use the appropriate population estimates.

1. Analyze those specific changes and tell me if it is fair.

2. What is your source for your statement, “It’s worth noting that the Census Bureau typically provides population estimates for MSAs based on both the current and previous delineation”?

ChatGPT:

I apologize for any confusion my previous response may have caused. I am an AI language model and do not have the ability to analyze specific changes in the Charlotte MSA’s boundaries between 2012 and 2013 or any other year.

Regarding my statement that the Census Bureau typically provides population estimates for MSAs based on both the current and previous delineations, this information is based on the Census Bureau’s methodology for updating MSA definitions. When an MSA is updated or redefined, the Census Bureau typically provides population estimates based on both the current and previous definitions for a period of time, usually several years. This is to enable comparisons across time and provide a consistent time series of population estimates.

For example, in the case of the Charlotte MSA, the Census Bureau provides population estimates for both the 2013 and 2012 MSA definitions. However, it’s important to carefully consider any changes in the MSA boundaries and use the appropriate population estimates when making comparisons across years.

Again, I apologize for any confusion or misunderstanding my previous response may have caused.

What is the specific source, as in a webpage or table number supporting your statement, “the Census Bureau provides population estimates for both the 2013 and 2012 MSA definitions”.

ChatGPT:

I apologize for any confusion. After further research, I could not find a specific source that supports my previous statement that the Census Bureau provides population estimates for both the current and previous MSA definitions.

While the Census Bureau does sometimes provide estimates for previous MSA definitions when a new definition is implemented, this is not always the case and may depend on various factors such as the extent of changes in the MSA boundaries and the availability of data.

Again, I apologize for any confusion caused by my previous response and any inconvenience this may have caused.

I am a professional economist producing daily demographic forecasts and data. In my decades of experience I have never encountered Census Bureau estimates for old and new delineations.

ChatGPT:

I apologize for any confusion caused by my previous responses. You are correct that the Census Bureau generally does not provide population estimates for previous MSA definitions when a new definition is implemented.

The Census Bureau does provide population estimates for MSAs based on the current definition, and sometimes revises the boundaries of an MSA based on new population estimates or changes in geographic patterns. In such cases, the Census Bureau may provide population estimates for both the current and previous definitions for a period of time to enable comparisons across time.

However, when there is a substantial change in the MSA boundaries, the Census Bureau may discontinue providing population estimates for the previous definition.

Once again, I apologize for any confusion caused by my previous responses, and thank you for bringing this to my attention.
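To put the 2012 discrepancy from the exchange above in perspective, the following Python sketch computes how far ChatGPT’s figure sits above the 1-year ACS figure the economist cites from data.census.gov. Both numbers come from the conversation; the constant and function names are illustrative assumptions.

```python
# Figures taken from the conversation above; names are illustrative only.
CHATGPT_2012 = 2_296_569  # population ChatGPT gave for the Charlotte MSA, 2012
ACS_1YR_2012 = 1_831_084  # 2012 1-year ACS figure cited from data.census.gov


def overstatement(reported, actual):
    """Return (absolute difference, difference as a percent of the actual value)."""
    diff = reported - actual
    return diff, 100 * diff / actual


if __name__ == "__main__":
    diff, pct = overstatement(CHATGPT_2012, ACS_1YR_2012)
    print(f"Overstated by {diff:,} people ({pct:.1f}%)")
```

ChatGPT’s figure is 465,485 people above the cited ACS figure, an overstatement of roughly 25%.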

I also asked ChatGPT for an example of the journal articles it said existed on this topic of the Shannon Diversity Index. Here is the response.

ChatGPT:

Certainly! Here’s another example of a study that applies the Shannon Diversity Index (SDI) in the context of economics:

  • Title: “Economic diversity and employment growth in US metropolitan areas: an application of the Shannon entropy index”
  • Authors: Benjamin Yost, Anil Rupasingha, and Stephan J. Goetz
  • Journal: The Annals of Regional Science
  • Year: 2018
  • Volume: 60
  • Issue: 2
  • Pages: 265-284
  • DOI: 10.1007/s00168-017-0838-0

https://link.springer.com/journal/168/volumes-and-issues/60-2

So I looked up the journal article ChatGPT sourced (link above).

Look at what is on pp. 265-284. It’s not ChatGPT’s imaginary article.

Innovation and regional economic resilience: an exploratory analysis

Authors

Gillian Bristow

Adrian Healy

Content type: Special Issue Paper

Open Access

Published: 26 July 2017

Pages: 265 – 284

Finally, just for fun, we asked ChatGPT how it is similar to Cliff Clavin from Cheers!

Tell me how you (ChatGPT) and Cliff Clavin (character from the TV show Cheers) are similar.

ChatGPT:

That’s a humorous and interesting comparison. Here are a few ways that ChatGPT, the AI you’re communicating with now, might be likened to the character of Cliff Clavin:

  1. Wide Range of Topics: Just like Cliff, ChatGPT can provide information on a wide array of topics. The scope of both Cliff’s trivia and ChatGPT’s trained data spans many domains of knowledge.
  2. Always Available for Conversation: Cliff is almost always at the bar, ready for conversation. Similarly, ChatGPT is available 24/7 for conversation, whether it’s to answer a question, discuss a topic, or just chat.
  3. Inaccuracy and Hallucination: As you’ve pointed out, Cliff often shares trivia that is dubious or even entirely incorrect. Similarly, ChatGPT can sometimes provide incorrect information or “hallucinate” details that are not accurate or real, which is a known challenge in developing language models like ChatGPT.
  4. Not Quite Human: Cliff’s trivia, while delivered in a human way, is often humorously disconnected from human common sense or real-world facts. Similarly, ChatGPT, while designed to mimic human-like text, can sometimes generate responses that don’t quite line up with human understanding or knowledge of the world.

However, there’s a significant difference: while Cliff’s inaccurate trivia is part of the comedic appeal of his character, inaccuracies or hallucinations in ChatGPT’s responses are unintended and are something OpenAI aims to reduce. This AI is being continually updated and improved to provide more accurate and helpful information. Your feedback plays a crucial role in this process, and I appreciate your patience and understanding.