Comparing geologic maps and large language models: All models are wrong, some are useful

A post mortem on an experimental web application that combined geologic maps with LLMs

When people discuss the errors of Large Language Models (LLMs), the term 'hallucination' is often used for factual errors. To some extent I understand why that term is used, as 'hallucination' was previously used to describe early generative imagery from Google's Deep Dream model around 2015, which did indeed look dreamy.

Image generated with Google's Deep Dream model circa 2015.

LLMs also first got huge public attention in the form of ChatGPT's search box interface, which reminds everyone of the Google search box, which does pull from an index of real websites. Hence, it makes sense that people would expect an LLM to be pulling from an index of real facts and blame the model for making up things that do not exist.

However, framing errors as 'hallucination' has always struck me as awkward, or at least not very useful. Despite the similarities to traditional text search interfaces, LLMs aren't built in a way that makes it reasonable to expect them to be magic eight balls that pull exclusively from repositories of facts. Although they can operate that way to some extent when combined with a document store using the RAG (Retrieval-Augmented Generation) approach, an LLM by itself is built to be a probabilistic model of human language that extends text given in prompts. It doesn't know facts; it knows probabilistic relationships between words and phrases. The saying that pops into my head for LLMs is "all models are wrong but some are useful". This phrase is often heard when discussing scientific models. I like this saying for LLMs as it affirms that LLMs are going to be wrong, that we should be thinking about how we should expect them to be wrong, and that we should be thinking about how to reasonably expect them to be useful.

Another reason to like the saying is that it invites a comparison between geologic maps, which are one type of model, and LLM-generated geology field trip point descriptions, which are a different type of model. The ways in which these two different types of models are wrong but still useful are interesting to analyze.

This blog post will attempt to do a few things at once. First, it will serve as a post mortem for an experimental web application I built as a side project in 2023 that applies GPT3 and GPT4 large language models to the problem of generating plausible language that you might hear said by the leader of a geology field trip at any random point location on Earth. Second, this blog post will attempt to describe the similar and different ways in which two types of models, geologic maps and LLMs, can be both wrong and still useful. Finally, it will try to point out opportunities for using LLMs to address some of the traditional limitations of geologic maps and consider how LLMs might power new interfaces for digital geologic maps.

What was built

Outline of the web application: LAGDAL

LAGDAL is an acronym for Large language model Assisted Geology Descriptions of Arbitrary Locations. A previous blog post summarized what was being attempted very early on in the process. You can also read the README.md of the GitHub repository for LAGDAL to learn more.

The basic idea is that a user can navigate an area using synchronized Google Street View image, topographic map, and geologic map windows. When they find a point location that looks interesting, they can press a button to generate the text of a geology field trip description for that exact point location. The goal is to return a realistic sounding, accurate description of both the local and regional geology at a college 101 class level.

Where to try it out

==========================================================================================

UPDATE: THE LIVE SITE AT https://app-lagdal.azurewebsites.net/ HAS BEEN TURNED OFF

==========================================================================================

The live website, which may or may not be still live when you read this, is at https://app-lagdal.azurewebsites.net/ You can also clone the https://github.com/JustinGOSSES/LAGDAL repository locally and then follow the instructions to set up API keys for OpenAI, Google Street View, and Bing Maps such that the application runs on your local computer. There is also a version of the application without the LLM part that is entirely front-end with no server-side code. This means you can run it entirely as a static website, as seen on this GitHub pages page.

Screenshots of web application (LAGDAL) at a point in New York City's Central Park

New York City, screenshot A New York City, screenshot B New York City, screenshot C New York City, screenshot D

Who was it built for?

The application was built with a particular user in mind: someone with geologic knowledge looking to explore a new area, maybe searching for good field trip stops. It was also built with the question in mind of how much closer LLMs could take us to a completely automated field trip guide, at least at some basic level. Normally, creating a geologic field trip requires significant research by one or more experts who compile information from numerous published maps, reports, and papers. This is a high initial barrier if you only want the most basic description.

How I used it: on vacation

Most of my use of it was in testing the code, but I also used it for vacations. As someone trained as a geologist now working in a different field, I often like to learn a bit about the geology of the place I'm visiting, so I can understand the context of any outcrops I happen to see. LAGDAL is useful for this task as it lets me look in advance at where I might see outcrops and understand the different geologic units. It's better than just a basic geologic map, as the topographic map combined with the Street View imagery lets me easily see real outcrop pictures of each geologic unit before I'm there in person. The LLM field trip descriptions also provide some regional context that a geologic map description by itself does not give.

I've used it on travel to Cleveland, Ohio; Seattle, Washington; Provence, France; and Taipei, Taiwan. I've found it more useful than the online Macrostrat geologic map alone or combined with Google/Bing search results.

What new capabilities were used?

Syncing of topographic, street, and geologic map as well as street view imagery.
Large language model generation of field trip descriptions
Combination of geologic map text as primer for LLM text generation

Basic flow of LLM text generation

The basic flow of text generation involved several API calls.

Find geologic unit data at a given latitude and longitude based on Macrostrat API.
Use Bing Maps API to find geographic name from latitude and longitude.
Use OpenAI API to summarize geologic unit data from Macrostrat API into paragraph form from JSON format.
Use OpenAI API to get description of local geology using paragraph from previous step as prompt.
Use OpenAI API to get combined local and regional geology description using paragraph from last step in prompt.

These skills and a few others were available to an agent that picked which to use, but it mostly picked them in the order shown above.
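The five steps above can be sketched as a plain function chain. This is a minimal sketch under stated assumptions, not the LAGDAL code: the callables `fetch_units`, `reverse_geocode`, and `complete` are hypothetical stand-ins for the Macrostrat, Bing Maps, and OpenAI calls, and the prompt wording is invented for illustration. In LAGDAL an agent chose the order of the steps rather than a fixed function.

```python
import json
from typing import Callable


def describe_location(
    lat: float,
    lon: float,
    fetch_units: Callable[[float, float], list],     # stand-in for the Macrostrat lookup
    reverse_geocode: Callable[[float, float], str],  # stand-in for the Bing Maps lookup
    complete: Callable[[str], str],                  # stand-in for an OpenAI completion call
) -> str:
    """Run the five steps in a fixed order and return the final description."""
    units = fetch_units(lat, lon)        # step 1: geologic unit data at the point
    place = reverse_geocode(lat, lon)    # step 2: geographic name for the point
    summary = complete(                  # step 3: JSON -> paragraph
        "Summarize this geologic unit data as one paragraph:\n" + json.dumps(units)
    )
    local = complete(                    # step 4: local geology from the paragraph
        f"Describe the local geology near {place} using this summary:\n{summary}"
    )
    return complete(                     # step 5: add regional context
        f"Extend this local description with regional geologic context:\n{local}"
    )


if __name__ == "__main__":
    # Fake services so the sketch runs without any API keys.
    fake_units = lambda lat, lon: [{"name": "Beaumont Formation", "lith": "clay, silt"}]
    fake_place = lambda lat, lon: "Houston, Texas"
    echo = lambda prompt: "[LLM output for: " + prompt.splitlines()[0] + "]"
    print(describe_location(29.760082, -95.432943, fake_units, fake_place, echo))
```

Passing the services in as arguments keeps the chain testable without network access, and it makes visible that each LLM call sees only the text produced by the step before it.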

Post Mortem: Non-model related

Too slow

The text generation involved multiple calls to OpenAI and other APIs, which meant the time between clicking the "generate" button and getting a description of the field trip location could be several minutes in total. This is too slow for many users, who are used to immediate feedback on websites. One way to minimize this might be to have the intermediate results printed to the website window as well. This is the experience when running via command line or in the standard ChatGPT interface, and it results in less waiting but not zero waiting.
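One way to surface intermediate results is to write the chain as a generator that yields each step's output as soon as it is ready. The sketch below is illustrative only: `run_steps` and the step names are hypothetical, and the prototype did not work this way.

```python
from typing import Callable, Iterable, Iterator, Tuple


def run_steps(
    steps: Iterable[Tuple[str, Callable[[str], str]]],
    initial_input: str,
) -> Iterator[Tuple[str, str]]:
    """Run each step on the previous step's output, yielding results as they finish."""
    text = initial_input
    for name, step in steps:
        text = step(text)  # in a real app this would be a slow API call
        yield name, text   # the caller can display this before the next step starts


if __name__ == "__main__":
    demo_steps = [
        ("unit summary", lambda t: t + " -> summary"),
        ("local geology", lambda t: t + " -> local"),
        ("regional geology", lambda t: t + " -> regional"),
    ]
    for name, text in run_steps(demo_steps, "map data"):
        print(f"{name}: {text}")
```

Because the generator is lazy, nothing after the first step runs until the caller asks for the next item. Web frameworks such as Flask can send a generator's output to the browser as a streamed response, so the user would see the unit summary while the later calls are still running.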

Poor experience on a mobile device

This prototype was developed quickly on a laptop, and it has not been optimized for a mobile experience. There are two elements that could be improved. First, the four map elements appear in the same locations on small mobile screens, which makes it hard to see fine details. Ideally, on mobile each map window would be full screen width and positioned in a vertical column of elements. This would make them easier to see. Second, the small size makes it annoying to navigate the map. Having each map element be full screen width would help with navigation of the map, but ideally there would also be a button to push to navigate the map to your actual location and an input form where you could type in an address for the maps to navigate to. Both of these would improve the map navigation experience on mobile.

Post Mortem: Model related

All models are wrong but some are useful

Geologic maps as imperfect models

As alluded to in the first section, geologic maps are imperfect models. However, they are still extremely useful despite having a variety of common errors.

Error type 1: Non-interesting things like dirt, trees, and your house are in the way

To start with, there's often other stuff on top of bedrock, be it dirt, vegetation, or buildings. Geologic maps try to indicate what's underneath all of that even if it is quite far down. What's shown on the map therefore can often be an imperfect guess at what's underneath. In landscapes with a lot of vegetation, the error rate can be higher. For instance, the geologic story of the local bedrock is a lot easier to figure out in New Mexico than Maryland due to all the exposed bedrock in New Mexico.

Error type 2: Not everyone wants to know the same things

Another class of error is when there's a difference between what the map maker tried to represent and what the person using the map wants to know. While geologic maps generally describe rocks, there are also geologists known as geomorphologists who study erosion and more recent unlithified sediments. The extent to which these more recent or unlithified sediments are included on geologic maps varies and can impact the map's usefulness for some users. Additionally, while geologic maps often focus on the lithology of bedrock, they can also describe structural features like strike and dip orientation of strata, age, or presence of fossils. Sometimes maps can be correct but not in a way that is useful to you.

An extension of this idea is that if the map makers are not focused on what you care about, that type of information might be overly generalized in ways that result in errors at the scale that matters to you but not to the map makers.

Error type 3: Resolution

It is not unusual for geologic maps to cover hundreds of square kilometers. As such, the geologic units represent large generalizations of the geology, rather than exact conditions at every point on the map. For example, it is not strange for a local outcrop to be 100% shale whereas the geologic map might say the unit is mostly sandstone with isolated shale. Generalization is occurring within a map polygon that might cover hundreds of square kilometers.

Additionally, the spatial boundaries between units are often more complicated than what is represented on the map. The boundary between two units will be drawn on the map as a sharp line because that is how maps are made, but the uncertainty in the exact position of the boundary could be hundreds of meters or even kilometers if there is dirt, vegetation, roads, buildings, etc. on top, or if the definition of one unit is not tightly constrained. Uncertainty in the boundary between two units, even if well exposed, can also be due to the lack of a sharp change in lithology or other characteristics between the two units. Where one formation ends and another begins can be fuzzy or hard to interpret without data like age dating, which is extremely sparsely distributed or open to interpretation.

Despite all these potential sources of uncertainty, the boundary between two formations will always be drawn as a sharp line on the final map with basically none of that uncertainty conveyed on the 2D map. It's a data informed best guess model. The model is still extremely useful even if it is not perfect and certain parts are wrong or highly generalized.

You can see an example of this resolution uncertainty in the Macrostrat geologic map by navigating to the state of Colorado in the United States and zooming from far out to close in. The Macrostrat map website shows different geologic maps at different zoom levels. Although many of the boundaries of the geologic formations agree as you zoom out and in, some do not. Different geologic map authors use different levels of generalization and in some cases different interpretations.

Lower resolution areas of Macrostrat geologic map used as underlying data

The geologic map used in the LAGDAL application is from Macrostrat. It integrates a large number of digital geologic maps from different parts of the world. As such, there is quite a bit of variation in the resolution of the mapping. Taiwan has only a handful of geologic units mapped. As a result, there are many differences between what's on the map and what's exposed in outcrops there. In contrast, the geologic maps of Texas or southern France are mapped at higher resolution, with comparatively fewer differences likely between what's on the map and what's exposed in outcrops.

Models can be wrong but useful if users have expectations for how they might be wrong

For all these reasons, it should not be all that surprising when you're faced with an outcrop that doesn't agree with what the large scale geologic map suggests should be there. A key reason geologic maps are still useful even if they are wrong at times is that most geologists have expectations about where and how they might be wrong. With LLMs being very new, people are still developing an intuition for when they might be right or wrong and why.

Large language models as imperfect models

LLM errors in the context of the LAGDAL application are a combination of the errors inherited from the underlying geologic map data used in initial prompts and LLM-specific errors.

Errors inherited from geologic maps

As noted above, the LAGDAL application chains together several prompts and the resulting text output. The first use of the OpenAI API is to generate a natural sounding short paragraph from the data in a JSON object returned from the Macrostrat Map API for the given latitude and longitude of the selected point location. As the geologic map data is used in this first prompt and then recycled in each later part of the chain, errors within the geologic map will flow through to the final text output of the LLM application. In the section below, I will discuss the types of errors that are inherited from the geologic map and how they might be mitigated in the future with the advantages that digital maps and LLMs provide.

Error type 1: Dirt and other non-interesting things in the way

To a large extent these errors simply get inherited from the geologic map. LLMs have no other way of knowing what's underneath the dirt, vegetation, or buildings.

However, there may be future potential here for LLMs to be more useful than geologic maps simply because they aren't static products. As such, there could be ways to structure the prompts such that some of the uncertainty around cover gets included in the generated text. For example, it would be possible to start the prompt chain with both the geologic map data at that single point and the data at several other points a short distance away. If there are differences in the descriptions, then maybe that could be integrated into the generated text to warn the user that there are different units nearby. Additionally, information on the thickness of the units is often part of the data that builds the map, but not visually represented on the map due to the 2D spatial nature of the map. However, there's nothing stopping an LLM from stating that the first unit is X but may be less than 2 meters thick, so you might instead be seeing the second unit, Y.
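The nearby-points idea could look something like the sketch below. It is an illustration rather than anything in LAGDAL: `unit_at` is a hypothetical lookup that returns the mapped unit name at a coordinate (in practice it would wrap a geologic map API call), and the 250 m default radius is an arbitrary choice.

```python
import math
from typing import Callable, List, Set, Tuple


def nearby_points(lat: float, lon: float, distance_m: float, n: int = 4) -> List[Tuple[float, float]]:
    """Return n points evenly spaced on a circle of radius distance_m around (lat, lon).

    Uses a flat-earth approximation, adequate for offsets of a few hundred
    meters away from the poles.
    """
    meters_per_degree = 111_320.0  # approximate length of one degree of latitude
    points = []
    for i in range(n):
        bearing = 2 * math.pi * i / n
        dlat = distance_m * math.cos(bearing) / meters_per_degree
        dlon = distance_m * math.sin(bearing) / (meters_per_degree * math.cos(math.radians(lat)))
        points.append((lat + dlat, lon + dlon))
    return points


def neighboring_units(
    lat: float,
    lon: float,
    unit_at: Callable[[float, float], str],
    distance_m: float = 250.0,
) -> Set[str]:
    """Names of units found nearby that differ from the unit at the selected point."""
    here = unit_at(lat, lon)
    nearby = {unit_at(p_lat, p_lon) for p_lat, p_lon in nearby_points(lat, lon, distance_m)}
    return nearby - {here}


if __name__ == "__main__":
    # Fake lookup: pretend a contact runs north-south just east of the selected point.
    fake_unit_at = lambda lat, lon: "Unit A" if lon < -95.4320 else "Unit B"
    others = neighboring_units(29.760082, -95.432943, fake_unit_at)
    if others:
        print("Other units are mapped within about 250 m: " + ", ".join(sorted(others)))
```

A non-empty result could be added to the prompt as a sentence such as "a different unit is mapped within about 250 m", giving the LLM something concrete to hedge with.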

Digital maps also have the potential to have multiple layers. Land use, land cover, and slope are all widely generated across the world and could be used as data supplied to LLMs to describe probability of outcrop locations in an area or probability of an outcrop being bedrock vs. recent sediment.

Error type 2: Geologists differ in what content they would like to see represented on geologic maps

In the current prototype, the limitations of geologic maps are inherited by the LLMs. Whatever the authors of the geologic maps integrated into Macrostrat chose to focus on is what gets represented in the generated text.

However, this is an area where LLMs could really shine compared to geologic maps. While maps are static 2D representations of the geology made by one group of people and used by a different group of people, LLMs are dynamic, with prompts that can be altered by the user. It would be easy to stick a knob on the web application prototype that says "geologist, geomorphologist, structural geologist, or paleontologist" and then have the field trip description generated specifically for that type of geologist. You could also imagine options for experience level, where the generated text is more or less technical for audiences of elementary school students, high school students, undergraduates, or graduate students. Again, this was not built into the prototype, but a few experiments with manually altering the prompts suggest that this would be a relatively easy thing to do.
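As a rough illustration of that knob, the sketch below builds a prompt from a specialty and an audience level. `build_prompt` and the exact wording are hypothetical; only the instruction "Describe the geologic history in narrative fashion over 6-10 sentences" comes from the prompts used in the prototype.

```python
PERSONAS = ("geologist", "geomorphologist", "structural geologist", "paleontologist")
LEVELS = ("elementary school", "high school", "undergraduate", "graduate")


def build_prompt(unit_summary: str, persona: str = "geologist", level: str = "undergraduate") -> str:
    """Build a field trip prompt tailored to a specialty and an audience level."""
    if persona not in PERSONAS:
        raise ValueError(f"unknown persona: {persona!r}")
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    return (
        f"You are a {persona} leading a geology field trip for {level} students. "
        "Describe the geologic history in narrative fashion over 6-10 sentences, "
        f"emphasizing what a {persona} would point out.\n\n"
        f"Geologic map summary:\n{unit_summary}"
    )


if __name__ == "__main__":
    print(build_prompt("Beaumont Formation: clay and silt.", "geomorphologist", "high school"))
```

Restricting the options to fixed lists keeps the knob a simple dropdown in the interface and avoids passing arbitrary user text into the prompt.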

Error type 3: Resolution

The resolution problem from geologic maps is largely inherited by the LLMs.

However, there are circumstances where LLMs can bring in higher resolution information. For example, when exploring Taipei, Taiwan, the geologic map is very low resolution. However, the agent in this case used the Bing Maps API to find a geographic location name, and the LLM was apparently able to use that to pull in the geologic formation name and reference additional information about the volcanic history of Yangmingshan National Park. This is information that likely existed in many texts in the LLM's training set and represents more information than was included in the geologic map. The LLM providing more local information than the Macrostrat API was fairly unusual in testing. Usually, the LLM only provided regional context.

LLM specific errors

Large language models (LLMs) are effectively probabilistic models of language. This gives them a lot of power to generate realistic text with little human generated deterministic programming required. However, that strength also shows up as a weakness. It makes them prone to certain errors where their probabilistic approach leads to occasional failures when a simple deterministic program would not fail.

Math

The classic example of this is that LLMs are worse at math than you might expect. There are many explanations of why available online, and a web search will turn them up.

Where it often comes up in the context of the LAGDAL prototype is when multiple formations are being described or when the description attempts to state the length of time over which a formation was deposited. LAGDAL tried to overcome this limitation by giving the agent, which picks what tool to use, a math tool. LangChain is the framework used in LAGDAL and it has an llm_math_chain capability.

    Tool(
        name="Calculator",
        func=llm_math_chain.run,
        description="Useful for when you need to answer questions about math. Do not add percentages together without using this!",
        # args_schema=CalculatorInput
    ),

Incorporating this skill helped reduce math related errors in the generated text, but it did not completely eliminate them, as the agent still had to make the decision to use the skill. At a simple level, the agent uses the math skill whenever there is a high text similarity between the description of when the math skill is useful, the prompt it is trying to solve at that moment, and any other input text. However, this strategy is not perfect, largely because it is not always obvious that the math skill should be used next instead of one of the other skills.
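A complementary option, not part of the LAGDAL prototype, would be to do the simplest arithmetic deterministically before any prompt is built, so the LLM only has to repeat a number rather than compute it. The sketch below assumes ages are given in millions of years (Ma) with a younger "top" age and an older "base" age. The function names are hypothetical; the ages are the ones quoted in the Port Clinton example later in this post.

```python
def duration_myr(top_age_ma: float, base_age_ma: float) -> float:
    """Length of an interval in millions of years, from its top (younger) and base (older) ages."""
    if base_age_ma < top_age_ma:
        raise ValueError("base age must be older (larger) than top age")
    return round(base_age_ma - top_age_ma, 3)


def age_sentence(name: str, top_age_ma: float, base_age_ma: float) -> str:
    """A sentence with the duration already computed, ready to paste into a prompt."""
    span = duration_myr(top_age_ma, base_age_ma)
    return (
        f"{name} spans {base_age_ma} to {top_age_ma} million years ago, "
        f"a duration of about {span} million years."
    )


if __name__ == "__main__":
    print(age_sentence("Paleozoic sedimentary rocks", 251.902, 538.8))
    print(age_sentence("Middle Devonian sedimentary rocks", 382.7, 393.3))
```

This does not remove the need for the calculator tool, since the agent can still be asked questions that require other arithmetic, but it takes the most common calculation out of the model's hands.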

It should be noted that even though math related errors persist, models seem to be improving with regard to this type of error. I initially started this work with the text-davinci-003 model, which has been deprecated. After moving from text-davinci-003 to gpt-3.5 to gpt-4, I have noticed a decreasing rate of these types of errors.

Confusion prompted by words being used in different contexts

One of the first problems I noticed when I started to experiment with LLMs and geology is that word distribution in training data can lead to confusion, as the model's capture of important context is not perfect. I live in Houston, which is a great example of this. Houston is a city with a lot of oil and gas geologists who publish a lot of papers. As such, the bulk of the documents that mention the word "Houston" have nothing to do with the geology of Houston. If you used GPT-3 models and asked about Houston geology, you would sometimes get information about Gulf of Mexico geology, as that is the context most often addressed by documents that include the word Houston, and never the geology of the bedrock of Houston, which is what you were trying to ask about.

To get around this problem, I first grabbed data on the surface bedrock from the Macrostrat API, used a GPT API model to parse the JSON data into a naturally worded paragraph and then used that as part of a prompt to another GPT API call. By adding information from a deterministic source to the prompt, it constrains the LLM generated text to a more narrow probability distribution that is more likely to include the information you want. As a result, the Houston results are closer to what I would expect.
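The grounding step can be sketched as below. This is a simplified illustration: it formats the map record with a deterministic function rather than the GPT call used in LAGDAL, and the field names `name`, `lith`, `t_age`, and `b_age` are assumptions about the shape of the map record that should be checked against the actual API response.

```python
def unit_facts(unit: dict) -> str:
    """Turn one map-unit record into a short factual line for a prompt."""
    parts = [unit.get("name") or "Unnamed unit"]
    if unit.get("lith"):
        parts.append(f"lithology: {unit['lith']}")
    if unit.get("b_age") is not None and unit.get("t_age") is not None:
        parts.append(f"age: {unit['b_age']} to {unit['t_age']} Ma")
    return "; ".join(parts)


def grounded_prompt(place: str, units: list) -> str:
    """Prefix the question with mapped facts so the completion stays near them."""
    facts = "\n".join("- " + unit_facts(u) for u in units)
    return (
        f"Surface geologic units mapped at this point near {place}:\n{facts}\n\n"
        "Using these mapped units as the starting point, "
        "describe the bedrock geology at this location."
    )


if __name__ == "__main__":
    example_units = [{"name": "Beaumont Formation", "lith": "clay, silt"}]
    print(grounded_prompt("Houston, Texas", example_units))
```

With the unit name and lithology in the prompt, the most probable continuation is text about that unit, rather than about whatever topic most often co-occurs with the place name in the training data.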

Ordering of units

LLMs cannot reason in the human sense. They can only replicate the results of reasoning through probabilistic means when there is a strong correlation between the reasoning and the language used to describe the reasoning. Where this shows up in the LAGDAL prototype is in the ordering of units. Sometimes you would see an inaccurate description of the order of units, either in terms of which unit is on top or in terms of the age relationships.
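Age ordering, at least, is something a deterministic step could settle before the LLM is involved. The sketch below is a hypothetical helper, not LAGDAL code. It assumes each unit record carries a top age `t_age` and base age `b_age` in millions of years, and it says nothing about which unit is physically on top where faulting or overturning complicates the picture.

```python
def youngest_first(units: list) -> list:
    """Sort unit records from youngest to oldest by top age, then base age (both in Ma)."""
    return sorted(units, key=lambda u: (u["t_age"], u["b_age"]))


def ordering_statement(units: list) -> str:
    """A sentence stating the age order explicitly, ready to include in a prompt."""
    names = [u["name"] for u in youngest_first(units)]
    return "Units from youngest to oldest: " + ", ".join(names) + "."


if __name__ == "__main__":
    # Made-up units purely for illustration.
    demo = [
        {"name": "Unit A", "t_age": 10.0, "b_age": 20.0},
        {"name": "Unit B", "t_age": 1.0, "b_age": 5.0},
    ]
    print(ordering_statement(demo))
```

Stating the order in the prompt turns the task from inferring an ordering into repeating one, which is the kind of task a probabilistic language model handles more reliably.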

Ordering of events

Similarly, the descriptions would sometimes contain errors related to the ordering of events in the geologic history of a single formation. The LLM might not understand that a rock unit was initially deposited as sedimentary rock and then later metamorphosed. This leads to a description that says one rock unit is sedimentary and another is metamorphic when both are metamorphic, and the first unit merely has a description that says less about its metamorphic history. Again, this comes down to the LLM's inability to reason and its reliance on probabilistic patterns in the training data.

Replicating bad habits of field trip leaders

One of the funnier problems I have seen with the LAGDAL prototype is that it has a tendency to replicate the bad habits of field trip leaders, namely the overuse of trite phrases and filler phrases. I suspect I could reduce these if I really wanted to by adding a phrase to the last prompt that says "please avoid using trite phrases".

Examples of trite phrases include:

"fascinating location with a rich geological history" or "fascinating geology of the area"
"this is a very interesting location."
"These rocks were likely deposited in a variety of environments" or "rocks we see here were likely deposited in a variety of environments"
"layers upon layers of sediment accumulated"
"These rocks provide valuable insights"

These phrases are not wrong, but they often add little informational value to the description. You can always say a location is interesting or fascinating, but it's better to say why it's interesting. Yes, sedimentary rocks are layered and that layering can be observed, but unless you're talking about a specific pattern or type of layering, there's no need to say it.
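Whether a prompt change actually reduces these phrases is easy to measure with a simple counter. The sketch below is a hypothetical check, not part of the prototype; the pattern list is derived from the example phrases above and would need extending in practice.

```python
import re

# Patterns drawn from the example phrases listed above.
TRITE_PATTERNS = [
    r"fascinating",
    r"rich geologic(al)? history",
    r"very interesting location",
    r"deposited in a variety of environments",
    r"layers upon layers",
    r"valuable insights",
]


def find_trite_phrases(text: str) -> list:
    """Return every match of a known trite phrase, grouped in pattern-list order."""
    hits = []
    for pattern in TRITE_PATTERNS:
        hits.extend(m.group(0) for m in re.finditer(pattern, text, flags=re.IGNORECASE))
    return hits


if __name__ == "__main__":
    sample = "Today, we find ourselves at a fascinating location with a rich geologic history."
    print(find_trite_phrases(sample))
```

Running this over a batch of generated descriptions before and after a prompt change would give a rough count of how much filler remains.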

I will hypothesize that there are four reasons these phrases are so common. First, they are often found in some form in field trip guides that were probably used in training the LLM. Second, the form that is more widespread is the form that isn't specific, so that ends up being higher probability. Third, the text from the geologic map API that is used to construct the initial prompt will nearly always describe the end interpretation, not the field observations. As a result, there is little information in the prompt that speaks to the specific field observations that would be required to make some of these phrases information rich instead of generic. Finally, in one of the prompts in the chain used in the web application, there is the instruction "Describe the geologic history in narrative fashion over 6-10 sentences as a professor leading a geology field trip.". This language in the prompt heavily biases the final text to include phrases specific to field trip guides, but the data used as input only includes interpretations, not field observations. As such, the final predicted text includes an overabundance of generic field trip guide phrases that lack observational field detail.

Evaluation of example generative field trip stop descriptions

These examples were produced in December 2023 using a combination of text-davinci-003 and gpt-4 models. The LAGDAL application is a Flask app that primarily uses the agent_website_explore.py file seen in the LAGDAL repository. This "agent" approach solves for a single prompt by probabilistically picking from different skills, where each skill is either a deterministic function that calls an API or makes a calculation, or an LLM API call that uses a specific model and prompt to generate text. The "agent" chains together different skills to produce a final generative text result. As such, if you were to generate geology field trip descriptions at the exact same locations with the exact same code, you would not get the same output text.

Port Clinton, Ohio, USA

Port Clinton Screenshot

Geology Response: Good morning, everyone! Welcome to our geology field trip. Today, we find ourselves at a 
fascinating location with a rich geologic history. Directly in front of us, we have an outcrop that provides 
a glimpse into the geologic story of this region. The uppermost surface geology at our location is primarily 
composed of Paleozoic sedimentary rocks, which have an age range of 251.902 million years to 538.8 million years. 
These rocks indicate a sedimentary environment for their deposition, suggesting that this area was once 
covered by ancient seas or lakes. As we examine these rocks, we may come across fossils, providing evidence of 
the diverse marine life that once thrived here. Beneath the Paleozoic sedimentary rocks, we have a layer of 
Middle Devonian sedimentary rocks, with an age range of 382.7 million years to 393.3 million years. These rocks 
also suggest a sedimentary environment, further supporting the idea of a marine or lacustrine setting. The Middle 
Devonian period was characterized by the diversification of fish and the emergence of early tetrapods, so keep 
an eye out for any fossilized remains that may be present. Now, let's zoom out and consider the regional geology. 
Our location lies within the broader context of the Appalachian Basin, which stretches across several states in 
the eastern United States. The Appalachian Basin is known for its extensive sedimentary rock formations, which 
were deposited during various geologic periods. In this region, the Paleozoic sedimentary rocks we see at the 
surface are part of a larger sequence known as the Appalachian Plateau. These rocks were formed as layers of 
sediment accumulated over millions of years, eventually becoming compacted and lithified into solid rock. The 
Appalachian Plateau is characterized by its gently rolling hills and scenic landscapes, shaped by erosion and 
tectonic forces over time. The Middle Devonian sedimentary rocks we observe here are part of the broader Devonian 
system, which is well-represented throughout the Appalachian Basin. During the Middle Devonian period, this 
region experienced a significant marine transgression, with seas advancing and retreating multiple times. This 
dynamic environment led to the deposition of various sedimentary rocks, including the ones we see today. 
In conclusion, the geologic story at our location reveals a history of sedimentation in a marine or lacustrine 
environment during the Paleozoic and Middle Devonian periods. These sedimentary rocks are part of the larger 
Appalachian Plateau and Devonian system, which are characteristic of the regional geology in this area. 
As we explore further, let's keep our eyes open for any clues that may shed light on the ancient environments 
and life forms that once thrived here. Enjoy the rest of our field trip!

Critique of Port Clinton text

You get a trite filler phrase to start out: "Today, we find ourselves at a fascinating location with a rich geologic history." Love the enthusiasm, but it is not specific to the location and the repetition of "fascinating" in these descriptions quickly gets old.
One of the problems with the Macrostrat API, which I had solved but which came back after I fixed another problem, is that the API will sometimes return two layers that are both essentially the surface layer, one from a higher resolution map and the other from a lower resolution map, instead of two layers that sit on top of one another. There is no obvious way to tell which result is the higher resolution version. As a result, you'll sometimes get errors like this, where it says that the Paleozoic sedimentary rocks are from the Paleozoic era, which is 251.902 to 538.8 million years old, and that the Middle Devonian sedimentary rocks are from the Middle Devonian era, which is 382.7 to 393.3 million years old. But it's talking about the same bedrock; the different geologic maps just have different levels of specificity in their information. The Middle Devonian is a subset of the Paleozoic, not a separate unit. I suspect this is a problem I could handle with more investigation of the Macrostrat API, but I never got around to it.
The phrase "the geologic story at our location reveals a history of sedimentation in a marine or lacustrine environment" is incorrect. The rocks at this location are all marine. Further east there are some lacustrine rocks, but nothing anyone would call out at this location or probably even mention at this stop. Although we are constraining the local point geology with the geologic map, the LLM is being asked to also discuss regional geology, with the only thing constraining it being the already generated summary of local geology at that location. That sometimes leads to problems, as shown here. One possible solution would be to supply information on regional geology by calling different point locations on the geologic map, leveraging Wikipedia articles about the regional geology of the state or country, or using some other approach that constrains the LLM in some way. A Wikipedia + LLM approach was trialed early on and seemed successful inside the United States of America, using Wikipedia's API and a "state of ___ geology" search to find pages with regional geology information. However, this approach was less successful outside the United States of America, as the Wikipedia pages were not as consistent in their structure and content. Sometimes the first X Wikipedia results for the country in question would all be about something other than geology.
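One heuristic for the duplicate-layer problem described above is to check whether one unit's age range sits entirely inside the other's, as the Middle Devonian does inside the Paleozoic. This is a sketch of an idea that was not tried in the prototype. The `t_age` and `b_age` field names (top and base ages in millions of years) are assumptions, and nested ages are only a hint, since a genuinely separate unit could also fall inside a broad age range.

```python
def is_nested(inner: dict, outer: dict) -> bool:
    """True if inner's age range lies entirely within outer's (ages in Ma)."""
    return outer["t_age"] <= inner["t_age"] and inner["b_age"] <= outer["b_age"]


def likely_same_bedrock(a: dict, b: dict) -> bool:
    """Flag pairs whose age ranges nest, hinting at one rock mapped at two resolutions."""
    return is_nested(a, b) or is_nested(b, a)


if __name__ == "__main__":
    # Age ranges as quoted in the Port Clinton description.
    paleozoic = {"name": "Paleozoic sedimentary rocks", "t_age": 251.902, "b_age": 538.8}
    middle_devonian = {"name": "Middle Devonian sedimentary rocks", "t_age": 382.7, "b_age": 393.3}
    if likely_same_bedrock(paleozoic, middle_devonian):
        print("Age ranges nest: these may be the same bedrock at different map resolutions.")
```

When the check fires, the prompt could keep only the unit with the narrower age range, or tell the LLM explicitly that the two records may describe the same rock.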

Houston, Texas, USA

Houston Screenshot

Location: latitude = 29.760082, longitude = -95.432943

Geology Response: Good morning, everyone! Welcome to our geology field trip. Today, 
we find ourselves at latitude 29.760082 and longitude -95.432943, where we have a fantastic 
opportunity to explore the fascinating geologic history of this region.At this point 
location, the uppermost surface geology is composed of Pleistocene sedimentary rocks, 
which date back to approximately 0.0117 million years ago. These sedimentary rocks 
were formed during the Pleistocene Epoch, a time characterized by repeated 
glaciations and interglacial periods. The Pleistocene sediments here consist of various 
types of sedimentary rocks, such as sand, silt, and clay, which were deposited by rivers, wind,
 and other natural processes.Beneath the Pleistocene sediments, we have the Beaumont Formation, 
 which is composed primarily of clay and silt. This formation dates back to around 0.6558 million 
 years ago and provides us with valuable insights into the geological processes that shaped 
 this region. The Beaumont Formation was deposited in flood basins, coastal lakes, and former
stream channels on a deltaic plain.Now, let's zoom out and take a look at the broader regional
geology. Our current location lies within the Gulf Coastal Plain, a vast region that stretches 
from the Gulf of Mexico to the interior of the continent. This plain is characterized by a 
relatively flat topography and is primarily composed of sedimentary rocks.The Gulf Coastal 
Plain was formed over millions of years through the deposition of sediments eroded from the 
surrounding highlands and carried by rivers and streams. These sediments were then deposited 
in various environments, including river floodplains, deltas, and coastal areas, resulting 
in the diverse sedimentary rock formations we see today.Throughout its geologic history, 
this region has experienced significant changes in sea level, climate, and tectonic activity,
all of which have influenced the deposition and erosion of sediments. The repeated cycles 
of sea-level rise and fall during the Pleistocene Epoch, for example, played a crucial role
in shaping the landscape we see today.In addition to the Pleistocene and Beaumont Formation,
other notable rock units in this region include the Lissie Formation, the Frio Formation,
and the Yegua Formation, each representing different periods of deposition and 
environmental conditions.As we continue our field trip, I encourage you all to observe 
the various rock outcrops and formations we encounter along the way. By studying the 
geology of this region, we can gain a deeper understanding of the Earth's dynamic processes
and the intricate history that has shaped our surroundings.Let's move on and explore
 more of the fascinating geology this region has to offer.

Critique of Houston text

A minor critique is that the phrase "Pleistocene Epoch, a time characterized by repeated glaciations and interglacial periods" might be read to imply that glaciation reached this location, which it did not. That being said, every field trip I have been on in the Houston area mentioned glaciation in terms of global sea level rise and changes in sedimentation, so it is not unexpected to see it mentioned here from a probability standpoint; the full context just was not explained as well as it might have been.
Other than that, this is a pretty good description of the geology of the Houston area and much improved over the initial testing with ChatGPT in spring of 2023, when some of the content reflected geologic writing by people in Houston rather than the surface bedrock of Houston.

Bocas del Toro, Panama

Bocas del Toro Screenshot

Location: latitude = 9.344765, longitude = -82.170057

Geology Response: Good morning, students! Welcome to our geology field trip. Today, we find ourselves at a fascinating 
location with coordinates latitude = 9.344765 and longitude = -82.170057. As we observe the rock outcrop in front of us,
we can see that the uppermost surface geology is Neogene sedimentary.This layer of sedimentary rocks tells us a story 
about the geologic history of this region. The Neogene period, which began approximately 2.58 million  years ago and 
ended around 23.03 million years ago, was a time of significant geological activity. During this period, the Earth 
experienced major climatic changes and the emergence of diverse life forms.The presence of sedimentary rocks suggests that this area was once a sedimentary 
environment, such as a river delta or a shallow marine environment. Sedimentary rocks are formed through the 
accumulation and compaction of sediments, which can include materials like sand, silt, and clay. Over time, these 
sediments become lithified, forming solid rock layers.Now, let's zoom out and consider the regional geology. Our current
location lies within the broader context of Central America, a region known for its complex tectonic history. 
Central America is situated at the boundary between the Caribbean Plate and the Cocos Plate, which are both part 
of the larger tectonic puzzle in this area.The collision and subduction of these plates have played a significant 
role in shaping the geology of Central America. The ongoing subduction of the Cocos Plate beneath the Caribbean Plate 
has led to the formation of volcanic arcs, such as the Central American Volcanic Arc, which stretches from Guatemala to 
Panama.In addition to volcanic activity, the tectonic forces at play have also resulted in the formation of mountain 
ranges and the uplifting of sedimentary basins. These processes have contributed to the deposition of sedimentary rocks, 
like the ones we see here, as well as the creation of diverse landscapes and ecosystems.Furthermore, Central America has 
a rich history of seismic activity due to the tectonic forces in the region. Earthquakes and volcanic eruptions are common 
occurrences, reminding us of the dynamic nature of our planet.In conclusion, the Neogene sedimentary rocks we observe at 
this location provide a glimpse into the geological history of Central America. They tell a story of ancient sedimentation 
in a sedimentary environment, shaped by the tectonic forces and volcanic activity that have shaped this region over 
millions of years. As we continue our field trip, let's keep our eyes open for more clues and evidence of the fascinating 
geologic processes that have shaped this part of the world.

Critique of Bocas del Toro text

Having been to this area, I can say it is more complicated than the available geologic map presents. This limitation carries through to the description. It says "The presence of sedimentary rocks suggests that this area was once a sedimentary environment, such as a river delta or a shallow marine environment", which is true, but what makes this area interesting is that both in the modern day and in the bedrock there's a mix of siliciclastic, carbonate, and a very small amount of volcanic rocks. In fact, just down the beach from this view, there's pillow basalt peeking above the beach sand, reflecting lava that erupted underwater. However, all the Macrostrat geologic map API says is "AGE: Neogene to Neogene, AGE: 2.588 to 23.03 millions years, Lith: sedimentary". So, the LLM is doing the best it can with the information it has. The resolution of the geologic map doesn't allow for the more complicated and interesting geology to be reflected in the description.
I find the regional description is pretty good for a 101 level introduction.

Taiwan

Taiwan Screenshot

Geology Response: Welcome, students, to this exciting geology field trip! Today, we find ourselves at a fascinating location 
with coordinates latitude = 25.178327 and longitude = -238.437555. Direct your attention to the rock outcrop in front of us, 
which provides a glimpse into the geologic history of this region.At this point location, the uppermost surface geology is 
composed of intermediate volcanic rocks from the Pliocene epoch, approximately 5.333 million years ago. Although the thickness 
of this layer is unknown, it is a significant clue to understanding the regional geology.To fully comprehend the geologic 
story of this area, we must consider the broader context. The Pliocene epoch was a time of intense volcanic activity in this region. 
The volcanic rocks we observe here were formed through the eruption of magma from the Earth's mantle. These eruptions were likely 
associated with the movement of tectonic plates and the creation of new crust.As we delve deeper into the regional geology, we discover 
that this area is part of a larger volcanic arc. Volcanic arcs are formed when one tectonic plate subducts beneath another, leading 
to the melting of the subducting plate and the subsequent eruption of magma. This process is known as subduction, and it results in the 
formation of volcanic mountains and the release of volcanic gases.The presence of intermediate volcanic rocks suggests that this region 
experienced a subduction event involving the convergence of two tectonic plates. The magma that formed these rocks likely originated 
from the partial melting of the subducting plate, which was rich in silica and other minerals. As the magma rose to the surface, it 
cooled and solidified, forming the volcanic rocks we see today.Furthermore, the age of these volcanic rocks provides valuable information 
about the tectonic history of this region. The Pliocene epoch corresponds to a time when the tectonic plates in this area were actively 
converging. This convergence resulted in the formation of a volcanic arc, which extended along the western margin of the continent. 
In conclusion, the intermediate volcanic rocks we observe at this point location offer a glimpse into the dynamic geologic history of this 
region. They indicate a period of intense volcanic activity during the Pliocene epoch, likely associated with the subduction of tectonic 
plates and the formation of a volcanic arc. By studying these rocks and their context within the regional geology, we can gain valuable 
insights into the processes that have shaped our planet over millions of years.

Critique of Taipei text

The prompt asks for a 101 level description of the geology, so just saying "part of a larger volcanic arc" is maybe okay, but it does generalize the geology a bit, as can be quickly seen in this wikipedia article on the geology of Taiwan.
This exact location is in Yangmingshan National Park, which is the location of a volcano that was recently determined to be more active than previously thought. The LLM description doesn't mention this at all, probably because the geologic map for this location is quite coarse and most write-ups are relatively recent. However, if you were on an actual field trip there, it would be the primary thing you would want to talk about. Again, the generative text is limited to a certain extent by the underlying geologic map data, though I would not be entirely surprised if future LLMs knew more about the geology of this area.

Cassis, France

France by sea Screenshot

Location: latitude = 43.202428, longitude = 5.498101

Geology Response: Good morning, students! Welcome to our geology field trip. Today, we find 
ourselves at a fascinating location with coordinates latitude = 43.202428 and longitude = 5.498101. 
As we stand here, we can observe an intriguing rock outcrop that provides us with valuable 
insights into the geologic history of this region.At this point location, the uppermost surface geology 
is composed of carbonate sedimentary rock and limestone of Early Cretaceous age. This layer was likely 
deposited in a continental shelf setting. This tells us that during the Early Cretaceous period, this area 
was covered by a shallow sea, where the accumulation of carbonate sediments and the growth of marine 
organisms led to the formation of limestone.Now, let's zoom out and explore the broader regional geology.
 The region surrounding our current location is part of the Provence-Alpes-Côte d'Azur region in southeastern 
 France. This area is known for its complex geologic history, shaped by tectonic forces and the 
 interplay of various sedimentary environments.During the Early Cretaceous period, the region was situated 
 on the northern margin of the Tethys Ocean, a vast body of water that separated the supercontinents 
 of Gondwana and Laurasia. As sea levels rose and fell, the area alternated between marine and 
 terrestrial environments, resulting in the deposition of different sedimentary layers.Over time, 
 tectonic forces caused the closure of the Tethys Ocean, leading to the collision of the African 
 and Eurasian plates. This collision resulted in the formation of the Alps, a majestic mountain range 
 that stretches across several European countries, including France. The intense pressure and folding 
 associated with this collision caused the rocks in our current location to be uplifted and exposed. 
 As we examine the limestone outcrop, we can see evidence of the tectonic forces that shaped this region.
The rock layers may exhibit folding, faulting, or even metamorphism, indicating the immense pressures 
and temperatures they experienced during the mountain-building process.In addition to the tectonic forces,
the region's geologic history has also been influenced by erosion and weathering. The action of wind,
water, and ice has sculpted the landscape, exposing the underlying rock layers and creating the 
diverse topography we see today.In conclusion, the carbonate sedimentary rock and limestone we 
observe at this point location provide us with a glimpse into the Early Cretaceous period when 
this area was submerged beneath a shallow sea. Zooming out, we see that the regional geology 
has been shaped by tectonic forces, resulting in the formation of the Alps and the exposure of 
these rocks. Erosion and weathering have further contributed to the unique landscape of this 
region. I hope this overview of the geologic story around our current coordinates has piqued your 
interest and sparked your curiosity for further exploration. Let's continue our field trip and 
discover more fascinating geologic features!

Critique of Cassis, France text

Nothing is obviously wrong here that I can tell. I consider it impressive given that the Macrostrat API data provided on local geology is pretty bare.
While using the LAGDAL application in different areas nearby while on vacation, I found that the boundaries between the different geologic units are sometimes not perfectly captured by the map. There were a few times when the geologic map and LLM would say I was in one geologic unit, but that unit would actually be 0.25 kilometers down the road. This is a limitation of the underlying geologic map API inherited by the LLM generated text.

Discussion

LLMs make tasks easy that were previously too effort intensive to attempt with deterministic code

Prior to the explosion of text generative AI, I made a JavaScript Observable Notebook called "Stratigraphic Speech" that you can try out here. It automatically takes your location via a browser API, calls the Macrostrat API, and generates speech that tells you about the geology at your current point location through hard coded extraction of Macrostrat API JSON data into a single speech template. I used it on a drive between Texas and Ohio, so my wife wouldn't have to constantly find the right page in the "Roadside Geology" books I had. It technically worked but quickly got boring, as it repeated the same template again and again. Additionally, it had no regional context and was limited to repeating lithology and age, nothing else.

Turning JSON formatted data of sometimes irregular content into natural language is super painful to code deterministically by hand, as there are so many possible slight variations to consider. It makes for a very long tail of edge cases, exceptions, and toil. One of the main advantages of LLMs is that they take away the need to solve for that long tail. They know how to say things in a way that is natural to humans, and they can do it for a wide variety of inputs. This makes it possible to create something people want to listen to at low cost when it simply wasn't before.
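To make the long tail concrete, here is a minimal sketch of the deterministic template approach. It is not the original notebook code (which was JavaScript), and the field names `lith`, `top_age_ma`, and `bottom_age_ma` are hypothetical stand-ins rather than actual Macrostrat field names.

```python
def template_speech(unit: dict) -> str:
    """Fill one fixed sentence from a map unit.

    Every missing or irregular field needs its own hand-written branch.
    Field names are hypothetical, not the real Macrostrat schema.
    """
    lith = unit.get("lith") or "rock of unknown type"
    top, bottom = unit.get("top_age_ma"), unit.get("bottom_age_ma")
    if top is None or bottom is None:
        age = "of unknown age"
    elif top == bottom:
        age = f"about {top} million years old"
    else:
        age = f"between {top} and {bottom} million years old"
    return f"You are standing on {lith} that is {age}."
```

Even this toy version already needs three branches for age alone, and it still says the same sentence at every stop. Each additional field (depositional environment, formation name, thickness) multiplies the cases to handle.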

Another learning was that when constrained by location and local geology, LLMs can do a decent enough job of regional geology, at least at the 101 level of detail as shown in the examples above. Results are superior to previous efforts that either worked via traditional deterministic code manipulating Macrostrat API data or only used API calls to OpenAI models without any constraining details.

Necessary improvements before a LAGDAL-type application is functionally useful for most users

Non-model related improvements specific to LAGDAL

Speed: As described above, the current text generation experience is way too slow to be useful because it needs to chain together multiple API calls. To some extent, this can be solved through better software engineering, as this was a quick prototype. Reductions in model size and improvements in inference speed that will certainly occur in the future will also help.
Interface: The current interface can be a bit of a pain to use on mobile.
Multiple inputs: Currently, the only way users interact is to move around a map and click. Ideally, you could also write a location or latitude and longitude into a text box.
Visualization of available streetview imagery: Currently, the only way to know if there is streetview imagery available is to click and see if the streetview image appears. It would be nice to have a visual indicator of where streetview imagery is available. This is possible to integrate; I just didn't have time.

Model related improvements specific to LAGDAL

Better handling of when to use the math skill: Given that these errors tend to show up in repeated ways, there is likely a path to reducing them by either chaining additional prompts that look for these types of errors in the final text or by duplicating the math skill and modifying the description of the math skill in each version so it is more likely to be used in the right context.
Better handling of geologic map API result layering: As noted above, there are circumstances where multiple geologic maps are available for a location at different levels of resolution. The current implementation just takes the first one or, in some cases, both. It would be better to always select the highest resolution map data available.
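One possible heuristic for the layering problem, sketched below, uses age specificity as a proxy for map resolution: if one returned unit's age range sits entirely inside another's, the broader unit is treated as the coarser duplicate and dropped. This is an untested idea rather than something LAGDAL implements, the field names `top_age_ma` and `bottom_age_ma` are hypothetical, and the heuristic could wrongly drop a genuinely separate unit whose age range happens to contain another's.

```python
def age_span(unit: dict) -> float:
    """Length of the unit's age range in millions of years."""
    return unit["bottom_age_ma"] - unit["top_age_ma"]


def is_nested(inner: dict, outer: dict) -> bool:
    """True when inner's age range falls entirely within outer's."""
    return (
        inner["top_age_ma"] >= outer["top_age_ma"]
        and inner["bottom_age_ma"] <= outer["bottom_age_ma"]
    )


def drop_coarser_duplicates(units: list[dict]) -> list[dict]:
    """Drop any unit whose age range contains another unit's narrower range."""
    kept = []
    for unit in units:
        has_finer = any(
            other is not unit
            and is_nested(other, unit)
            and age_span(other) < age_span(unit)
            for other in units
        )
        if not has_finer:
            kept.append(unit)
    return kept


# The example from the critique above: both results describe the same bedrock.
units = [
    {"name": "Paleozoic sedimentary rocks", "top_age_ma": 251.902, "bottom_age_ma": 538.8},
    {"name": "Middle Devonian sedimentary rocks", "top_age_ma": 382.7, "bottom_age_ma": 393.3},
]
print([u["name"] for u in drop_coarser_duplicates(units)])
```

With the ages quoted in the critique, only the Middle Devonian unit survives, which matches the reading that the Middle Devonian is a subset of the Paleozoic rather than a separate layer.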

Future opportunities to improve geologic maps with LLMs

Geologic maps are a critical tool for geologists and geoscientists as well as the general public. They are used to understand the geologic history and structure of an area, to identify natural resources, and to set expectations of what to see in outcrop. LLMs integrated with geologic maps could address a few of the usual limitations of static 2D geologic maps.

Accessibility

Geologic maps are traditionally a visual medium with a lot of information packed into a small space, requiring the reader to look back and forth between locations on the map, map symbols, legends, and accompanying texts. This can be difficult for people with visual impairments. LLMs could be used to generate audio descriptions of the geology at a point location on the map. This could also open up ways to leverage maps when eyes are busy, for instance when driving. Adding speech-to-text and text-to-speech interfaces on top of LLMs + geologic maps could enable interactive question answering about the geology of an area.

Geospatial analysis functions combined with LLM-driven natural language interface as a new way to interact with map data

The LAGDAL application had a single task: produce a plausible geology field trip description for a single point location. There are many other tasks that could be done with LLMs and geologic maps, especially if a few more non-LLM skills were added to the mix of what is available for an agent. Native spatial analysis skills could include:

Analyzing the raw geologic map data of two formations to find a lithology type or characteristic possibly present in only one of them.
Finding the intersection of a driving route and the geojson of the geologic map to identify all the formations that will be crossed.
Determining which of all the formation boundaries crossed along a planned route represents the longest time gap.
Finding the nearest point within a specific formation from your current location.

These are all fairly typical geospatial analysis tasks that could be done with a few existing JavaScript or Python geospatial packages working on the geologic map data in geojson format from the existing Macrostrat API. If skills like these were available, different tasks could be accomplished by the LLM enabled application. Geology field trip descriptions could be generated that call out the important local features at each stop in a multiple stop field trip and put them in the regional context. LLM-driven applications could tell the user about the geology generalized over an entire trip. How much longer will you be driving through Cretaceous limestone? What is the next geologic unit you will be driving through? Did we just cross into a different bedrock lithology or is this recent sediment cover? These are all questions that become answerable in text or speech format without needing to visually examine or interpret the geologic map or accompanying reports directly.
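As a small sketch of what two of these skills might look like once the route has already been intersected with the map, the functions below work on an ordered list of the units crossed. The geometric intersection step itself is left out, and the field names `name`, `top_age_ma`, and `bottom_age_ma` are hypothetical, not the Macrostrat schema.

```python
from typing import Optional


def largest_time_gap(units_along_route: list[dict]) -> tuple[str, str, float]:
    """Return (unit_a, unit_b, gap_ma) for the boundary with the largest age gap.

    The gap is the time covered by neither unit's age range;
    overlapping or abutting ranges give a gap of 0.
    """
    best = None
    for a, b in zip(units_along_route, units_along_route[1:]):
        younger, older = sorted((a, b), key=lambda u: u["top_age_ma"])
        gap = max(0.0, older["top_age_ma"] - younger["bottom_age_ma"])
        if best is None or gap > best[2]:
            best = (a["name"], b["name"], gap)
    if best is None:
        raise ValueError("need at least two units to have a boundary")
    return best


def next_unit(units_along_route: list[dict], current_index: int) -> Optional[str]:
    """Name of the unit the route enters next, or None at the end of the route."""
    if current_index + 1 < len(units_along_route):
        return units_along_route[current_index + 1]["name"]
    return None


# Hypothetical route, listed in driving order.
route = [
    {"name": "Cretaceous limestone", "top_age_ma": 66.0, "bottom_age_ma": 145.0},
    {"name": "Quaternary alluvium", "top_age_ma": 0.0, "bottom_age_ma": 2.58},
    {"name": "Devonian shale", "top_age_ma": 358.9, "bottom_age_ma": 419.2},
]
print(largest_time_gap(route))
print(next_unit(route, 0))
```

Functions like these return small, structured answers that an LLM could then phrase in natural language, which keeps the arithmetic out of the model and in deterministic code.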

Improved handling of uncertainty

Geologic maps are inherently uncertain, but their nature as static visual 2D representations does not allow for much nuance in how uncertainty is visually represented. LLMs could be used to generate text that describes the uncertainty. Because LLM descriptions are generated on demand, they are not constrained to a static 2D representation. Functions could be created to note in the description when your location is near the boundary with another unit and uncertainty is therefore higher. Uncertainty can also be higher if the surface unit in question is thin, meaning the next unit down might actually be what is exposed in outcrop. Likewise, if unlithified sediment is very thick at that location, you might want to know that in addition to the first bedrock. A large portion of the information necessary for these types of interactions is already determinable from the Macrostrat API, so it could be included in a future version of the generative field trip description.
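A function of this kind could be as simple as the sketch below, which turns two measurements into caveat sentences that get appended to the prompt or the final description. The inputs are assumed to have been computed elsewhere, and the thresholds are arbitrary placeholders rather than values derived from any map accuracy standard.

```python
from typing import Optional


def uncertainty_notes(
    distance_to_boundary_m: float,
    surface_thickness_m: Optional[float],
    near_boundary_m: float = 250.0,  # placeholder threshold
    thin_unit_m: float = 10.0,  # placeholder threshold
) -> list[str]:
    """Return caveat sentences to include alongside a point description."""
    notes = []
    if distance_to_boundary_m <= near_boundary_m:
        notes.append(
            "The mapped contact with a neighboring unit is only about "
            f"{distance_to_boundary_m:.0f} m away, so the unit at this exact spot is uncertain."
        )
    if surface_thickness_m is None:
        notes.append("The thickness of the surface unit is unknown.")
    elif surface_thickness_m <= thin_unit_m:
        notes.append(
            "The surface unit is thin here, so the next unit down may be what is exposed in outcrop."
        )
    return notes
```

Keeping the caveats as plain sentences means they can be passed to the LLM as extra context or read aloud directly.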

Additionally, the current LAGDAL interface accepts a rather inflexible type of input (clicking on a map), but that is not the only possible interface. That style of interface was chosen for this prototype to avoid entirely the problem of prompt injection that comes from user provided text inputs into LLM applications. However, if you did allow users to input their own prompts, they could potentially ask uncertainty related questions like "given what we know about the local geology, how likely is it that I am looking at limestone vs. dolomite?" or "I am at the bottom of a very small canyon that might not be reflected on the geologic map, what is the next unit down?"

Make one map for multiple audiences

Another feature of geologic maps that was just alluded to is that they are highly generalized interpretations based on many observations. That underlying observational data is not included in the map, which limits the ability of the LLM to describe what observations might be possible at a given point location. If LLMs had access via APIs to some of the underlying observational data, several things would become possible.

First, they could describe what observations are possible in more detail. For example, they might be able to describe the exact type of fossils found in a given unit, or the exact type of sedimentary structures, instead of offering only generalities and final interpretations.

Second, if the underlying data is available, the LLM will sometimes be able to match the description to the needs of the user. For example, one person might want to know about the fossils, while another might want to know about the sedimentary structures or structural features like faults or joints. You can imagine a drop-down interface where the user selects the type of geology they are interested in hearing about.

Macrostrat has some information for fossils available in their existing API already, so it is possible to already build a subset of what is imagined here.

Take away messages for building future geologic map & LLM experiences

Large language models have the ability to summarize data into natural language at much lower cost than was possible with traditional programming.
Local geologic map data was effective at reducing error rates when the LLM was asked to generate descriptions of regional geology, presumably by narrowing the range of probable LLM outputs.
Opportunity exists for new interfaces that expand how geologic map data is used, including speech-to-text and text-to-speech interfaces, question answering, and natural language directed geospatial analysis.
Large language models allow for repeated, and potentially on the fly, generation of text for different purposes based on the same underlying geologic datasets, overcoming the static limitations of traditional two dimensional geologic maps where only a single final representation of reality is possible.