Parsing PDFs from the Texas Parks and Wildlife Department

There is an often-repeated statistic that less than 3% of Texas is public land. I don't know how accurate that figure is, but most of Texas is privately owned. Thankfully Texas is the largest of the 48 contiguous states, so 3% is still a large area. The biggest problem is determining exactly what counts as "public land" in Texas and who is allowed to do what there. It is a patchwork of Federal, State, County, City, and other regulatory frameworks. For example, did you know that

  • LCRA, an electrical utility company in Texas, maintains several parks?
  • Quitaque, Texas hosts a trailway on the site of an old railroad line that is part of the Caprock Canyons Trailway?

I can't really explain all the categories of public land here in Texas, but one I found out about recently is what the Texas Parks and Wildlife Department (TPWD) refers to as public hunting lands. The way this works is that there are large areas of land under TPWD management. That land is then opened for hunting and other public activities. Since TPWD is a state-level entity you might assume that each area is owned by the state. That isn't the case, as some of the public hunting lands are part of national forest areas. Actual land ownership can be ignored for the most part, as the regulations for access and usage are set and published by TPWD. You might also think this land is just for hunting, but that isn't the case either.

Access and regulation are where this process gets interesting. These are apparently governed by departmental policy, not by state law. TPWD has a number of different resources on their website. The primary document is the "Public Hunting Lands Map Booklet", distributed as an annual booklet, presently covering 2025-2026. The document is very dense and contains hundreds of pages with differing information. Here's a view of the booklet cover and a sample from page

Half a page dedicated to ticks

There's at least half a page dedicated to tick-borne diseases. While I suppose the public health aspect of this is admirable, it doesn't really help clarify land usage in any way. Each section of public hunting lands is generally referred to as a "Unit" and there is a page like this in the booklet for each unit. Here's what one of those looks like

A sample of one unit's informational page

This is the top of the "Justin Hurst WMA" which is Unit #721D. To make matters worse some units have their map presented in portrait orientation and some in landscape mode. In other words, good luck using this document on your phone.

There's another way to access this data, an interactive map you can use to find different areas. Each area is marked with a green star on the map:

You can use this map to pull up the details for an area if you know its location. When you click on a location you see this popup

The popup has a link that says "Aerial Map - Public Hunting Lands Map Booklet (Official Hunting map)". Clicking this link takes you to a PDF with just a few pages like this

This document is just a subset of the pages from the big booklet PDF I was referencing earlier. So if you just need the data for one "Unit" and want to carry it around, you can download it to your phone.

How do we search this data?

The problem with all of these PDFs is that there is no real way to search the data by parameters, or really at all. You can search the big booklet PDF with most viewers, but the word "water", for example, appears 283 times, and most of those are references to "waterfowl" or something like that. I figured extracting this data into something easier to understand and work with would be an admirable goal.

At some point I discovered that TPWD publishes a KML file with this data. To be precise, they publish a ZIP file called PublicHuntAreasDetailsKMZ2025-26.zip. Within that are the following files:

  1. PublicHuntAreas.kmz
  2. DoveLeasePoly.kmz
  3. E1976F8EA1E24B49BC2F374AA5FC1B65.xsl

The file PublicHuntAreas.kmz is a compressed KML file: the .kmz extension denotes a ZIP archive with an entry named doc.kml that is the actual KML. The file I started from has SHA256 51b6ab67b9e7438f7f982c636fcb547e808dabdb63294bd92da1260633450e79. The KML entries contain data like this

<name>Public Hunt Polygons</name>
    <Snippet></Snippet>
    <description><![CDATA[Public Hunt polygon data]]></description>
    <Placemark id="ID_00000">
      <name>San Angelo SP</name>
      <Snippet></Snippet>
      <description><![CDATA[<html xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:msxsl="urn:schemas-microsoft-com:xslt">
<head>
<META http-equiv="Content-Type" content="text/html">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body style="margin:0px 0px 0px 0px;overflow:auto;background:#FFFFFF;">
<table style="font-family:Arial,Verdana,Times;font-size:12px;text-align:left;width:100%;border-collapse:collapse;padding:3px 3px 3px 3px">
<tr style="text-align:center;font-weight:bold;background:#9CBCE2">
<td>San Angelo SP</td>
</tr>
<tr>
<td>
<table style="font-family:Arial,Verdana,Times;font-size:12px;text-align:left;width:100%;border-spacing:0px; padding:3px 3px 3px 3px">
<tr>
<td>SHAPE</td>
<td>Polygon</td>
</tr>
<tr bgcolor="#D4E4F3">
<td>PH_UnitNum</td>
<td>1166</td>
</tr>
<tr>
<td>LocName</td>
<td>San Angelo SP</td>
</tr>
<tr bgcolor="#D4E4F3">
<td>Class</td>
<td>Property Boundary</td>
</tr>
<tr>
<td>Active</td>
<td>Yes</td>
</tr>
<tr bgcolor="#D4E4F3">
<td>Name_Acres</td>
<td>San Angelo SP - 2,500 acres</td>
</tr>
<tr>
<td>Acres</td>
<td>2500</td>
</tr>
<tr bgcolor="#D4E4F3">
<td>KMLDescrip</td>
<td>&lt;strong&gt;San Angelo SP - 2,500 acres&lt;/strong&gt;&lt;br/&gt;&lt;a href=http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/1166.pdf&gt;2025-26 Map Booklet for Public Hunting Lands&lt;/a&gt;&lt;br/&gt;*Zoom In to see property boundary</td>
</tr>
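To get at markup like the sample above, the nested archives have to be unpacked first. Here is a minimal sketch using only the Python standard library, assuming the download is a ZIP that holds PublicHuntAreas.kmz and that the KMZ in turn holds doc.kml. The function names are made up for illustration and are not taken from the script linked at the end of this post.

```python
import hashlib
import io
import zipfile


def sha256_of(data: bytes) -> str:
    """Return the hex SHA256 digest of some bytes, for comparison with a published hash."""
    return hashlib.sha256(data).hexdigest()


def read_kml_from_kmz(kmz_bytes: bytes, entry: str = "doc.kml") -> bytes:
    """A .kmz is a ZIP archive; return the raw bytes of its KML entry."""
    with zipfile.ZipFile(io.BytesIO(kmz_bytes)) as kmz:
        return kmz.read(entry)


def read_kml_from_download(zip_path: str, kmz_name: str = "PublicHuntAreas.kmz") -> bytes:
    """Open the outer ZIP, pull out the named KMZ, then extract doc.kml from it."""
    with zipfile.ZipFile(zip_path) as outer:
        return read_kml_from_kmz(outer.read(kmz_name))
```

Calling read_kml_from_download("PublicHuntAreasDetailsKMZ2025-26.zip") should then return the KML bytes, provided the archive is laid out as described. The sha256_of helper can be run over the raw bytes of whichever file a published digest refers to.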

I was hoping the KML file would have boundaries, names, and usage regulations. Unfortunately we are not that lucky. However, each entry in the KML contains an embedded HTML document that appears to be the popup from the interactive map shown earlier. That HTML document in turn contains an anchor element that links to the PDF file that is authoritative for just that unit.

Parsing XML is easy enough with Python. Interestingly, this document contains far more entries than I expected. Most are titled things like "Private In-holdings". This makes sense, as a map file would need to show any private property boundaries adjacent to each unit. Ignoring these was easy as they don't contain any PDF links. I scraped through the KML and was able to extract the following links

Name | Unit Number | Link
San Angelo SP | 1166 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/1166.pdf
Elephant Mountain WMA | 725 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/725.pdf
Alabama Creek WMA | 904 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/904.pdf
Granger PHL | 709 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/709.pdf
Gene Howe WMA-W.A. (Pat) Murphy | 706 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/706.pdf
Luminant Texas | 607 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/607.pdf
Playa Lakes WMA - Taylor Lakes Unit | 751 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/751.pdf
White Oak Creek WMA | 727 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/727.pdf
North Toledo Bend WMA | 615 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/615.pdf
Black Gap WMA | 701 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/701.pdf
Lake Colorado City SP | 1096 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/1096.pdf
Angelina Neches/Dam B WMA | 707 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/707.pdf
Matagorda Island WMA | 722 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/722.pdf
Resaca de la Palma SP | 1743 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/1743.pdf
Huntsville SP | 1044 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/1044.pdf
Sabine River Authority | 630 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/630.pdf
Alazan Bayou WMA - Blount Tract | 747 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/747E.pdf
Alazan Bayou WMA - Old River Tract | 747 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/747E.pdf
Gene Howe WMA | 755 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/755.pdf
Matador WMA | 702 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/702.pdf
Lower Neches WMA - Old River Unit | 728 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/728.pdf
Abilene SP | 1001 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/1001.pdf
Old Sabine Bottom WMA | 732 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/732.pdf
Caddo National Grassland - Bois D'Arc | 901 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/901N.pdf
Muse WMA | 750 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/750.pdf
J. D. Murphree WMA - Salt Bayou Unit | 783 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/783S.pdf
Sea Rim SP | 1055 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/1055.pdf
Las Palomas WMA - Anacua Unit | 744 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/744.pdf
J. D. Murphree WMA - Big Hill Unit | 783 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/783N.pdf
Property Boundary | 513 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/513.pdf
Mad Island WMA | 729 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/729.pdf
Caddo Lake WMA | 730 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/730.pdf
Marlin Lake PHL | 2547 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/???.pdf
Nannie M. Stringfellow WMA | 716 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/716.pdf
Muse WMA | 750 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/750.pdf
Caddo Lake NWR | 509 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/509.pdf
Somerville PHL/Trailway Flag Pond-Lk Somerville SP | 711 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/711_1121.pdf
Lower Neches WMA - Nelda Stark Unit | 738 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/738.pdf
Chaparral WMA | 700 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/700.pdf
Gus Engeling WMA | 754 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/754.pdf
Las Palomas WMA - Taormina Unit | 715 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/710_715_718.pdf
Las Palomas WMA - Baird Unit | 710 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/710_715_718.pdf
Las Palomas WMA - Chapote Unit | 718 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/710_715_718.pdf
Las Palomas WMA - Longoria Unit | 741 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/741.pdf
Las Palomas WMA - Carricitos Unit | 714 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/714.pdf
Las Palomas WMA - Tucker Unit | 740 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/740.pdf
Guadalupe Delta WMA - Mission Lake Unit | 720 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/720.pdf
Las Palomas WMA - Arroyo Colorado | 739 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/739.pdf
Guadalupe Delta WMA-Hynes Bay and Guadalupe River | 723 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/723_724.pdf
Lower Rio Grande Valley NWR-La Casita East | 506 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/506.pdf
James E. Daughtrey WMA | 713 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/713.pdf
Bois d' Arc Lake PHL | 602 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/602.pdf
Pat Mayse WMA | 705 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/705.pdf
Mason Mountain WMA | 749 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/749.pdf
Cooper SP-South Sulphur Unit | 1155 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/1155.pdf
Bannister WMA | 903 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/903.pdf
Moore Plantation WMA | 902 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/902.pdf
Tony Houseman/Blue Elbow Swamp WMA | 712 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/712.pdf
Twin Buttes PHL | 502 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/502.pdf
Sam Houston National Forest WMA | 905 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/905.pdf
Lake Striker PHL | 601 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/601.pdf
Roger R. Fawcett WMA | 781 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/781.pdf
Caddo National Grassland - Ladonia | 901 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/901S.pdf
Tawakoni WMA | 708 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/708.pdf
Cooper WMA | 731 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/731.pdf
Big Lake Bottom WMA | 733 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/733.pdf
Ray Roberts PHL | 501 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/501.pdf
Justin Hurst WMA-North Stringfellow Unit | 721 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/721D.pdf
Las Palomas WMA - Ebony Unit | 719 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/719.pdf
Richland Creek WMA - Trinity Unit | 703 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/703S.pdf
Richland Creek WMA - Carl Frentress Unit | 703 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/703.pdf
Teacup Mountain WMA | 785 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/785.pdf
Justin Hurst WMA | 721 | http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/721.pdf
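A scrape like the one that produced the list above can be sketched with the standard library. This is only an illustration under a few assumptions: the anchor appears HTML-escaped inside each Placemark's description (as in the sample entry), the href may be unquoted, and the KML may or may not declare an XML namespace. The names extract_pdf_links and local_name are hypothetical and not taken from the script linked at the end of this post.

```python
import html
import re
import xml.etree.ElementTree as ET

# Matches quoted or unquoted hrefs that end in .pdf (including the odd "???.pdf").
PDF_HREF = re.compile(r'href="?(https?://[^\s"<>]+\.pdf)', re.IGNORECASE)


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def extract_pdf_links(kml_text):
    """Return (placemark name, PDF link) pairs; placemarks without a link are skipped."""
    root = ET.fromstring(kml_text)
    results = []
    for element in root.iter():
        if local_name(element.tag) != "Placemark":
            continue
        name, description = "", ""
        for child in element:
            if local_name(child.tag) == "name":
                name = (child.text or "").strip()
            elif local_name(child.tag) == "description":
                description = child.text or ""
        # The anchor is escaped inside the embedded page, so unescape before searching.
        match = PDF_HREF.search(html.unescape(description))
        if match:
            results.append((name, match.group(1)))
    return results
```

Entries such as "Private In-holdings" drop out on their own here because their descriptions contain no PDF link.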

This data should let me map each unit to the specific PDF that is authoritative for it. There are a few issues, though. Some unit numbers show up more than once, like 747. Other links look like http://tpwd.texas.gov/huntwild/hunt/public/annual_public_hunting/resources/???.pdf, which is obviously not valid. To try to clean this up I converted all the links to the https:// protocol and, for the ones ending in ???.pdf, substituted the unit number, since the valid links follow that pattern (a link for Unit 67 would end with 67.pdf, and so on).
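That cleanup step is small enough to sketch in a few lines. The function name normalize_link is hypothetical, and the substitution is only a guess at the real file name, based on the pattern of the valid links.

```python
def normalize_link(link: str, unit_number: str) -> str:
    """Upgrade a link to https and replace a '???.pdf' placeholder with the unit number."""
    if link.startswith("http://"):
        link = "https://" + link[len("http://"):]
    if link.endswith("/???.pdf"):
        # Guess the file name from the unit number, e.g. Unit 67 -> 67.pdf.
        link = link[: -len("???.pdf")] + f"{unit_number}.pdf"
    return link
```

Links that already point at a named PDF only have their protocol changed.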

This unfortunately did not get me very far. My guess is that this KML document was generated, uploaded to the website for download, and also used for the interactive map. At some point someone noticed the interactive map was not right and fixed it, but did not fix the publicly downloadable KML file. For the unit numbers whose URLs were still invalid (501, 703, 513, 601, 729, 730, 731, 733, and 905) I ended up just looking up the correct URLs in the interactive map. This got me to the correct PDFs, which meant I didn't really need the KML file any longer.

Parsing PDFs

Despite using PDFs for decades at this point, I've never once had a reason to understand the format. Given the little shop of horrors that is the PDF format I have no intention of understanding it either. I spent some time using the Python module pypdf to try and extract the text. This did work, but there were spaces injected into the recovered text for some reason. It appears that PDF internally is a series of drawing commands which may include text drawing commands. I moved over to the Python module pymupdf and was able to use the function page.get_text('blocks') to request text be extracted in blocks. The pymupdf module is able to extract the text in long strings like this

328.2430419921875   201.47653198242188  338.1158142089844   211.24362182617188  r   0   0
304.77801513671875  193.11988830566406  324.11358642578125  203.6174774169922   R i v   1   0
324.2057189941406   196.093994140625    333.2583923339844   206.92901611328125  e   2   0
247.56607055664062  148.87429809570312  421.1172790527344   219.11691284179688  GENERAL USE & VISITATION INFORMATION: On-site registration is required for all visitors. General public access to the Area is during daylight hours through designated legal entry points only and only by APH or LPU Permit. Advanced written permission of area manager is required for equestrian use. For more information call (903/389-7080). For more detailed map information, please see aerial photos on our website at http://www.tpwd.texas.gov/huntwild/hunt/.     3   0
497.0801086425781   66.44049835205078   519.759765625   78.44039154052734   Navarro County  4   0
516.3298950195312   93.15247344970703   545.0035400390625   105.1523666381836   Freestone County    5   0
348.79522705078125  70.28425598144531   471.0675964355469   103.83802032470703  Richland Creek WMA Carl Frentress Unit #703N 5,597 acres 
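Rows like these can be reduced to plain text with a sketch along the following lines. It assumes each block is a tuple shaped like (x0, y0, x1, y1, text, block number, block type), which is how pymupdf documents the result of page.get_text('blocks'). The helper names are hypothetical, and on older pymupdf releases the import is spelled fitz instead.

```python
def block_texts(blocks):
    """Keep only the text of text blocks, in the order they were given.

    Each block is assumed to be a tuple shaped like
    (x0, y0, x1, y1, text, block_no, block_type), where block_type 0 means text.
    """
    return [b[4].strip() for b in blocks if b[6] == 0 and b[4].strip()]


def pdf_block_texts(pdf_path):
    """Yield the block texts of every page in a PDF (needs the pymupdf package)."""
    import pymupdf  # third-party; imported lazily so block_texts works without it

    with pymupdf.open(pdf_path) as doc:
        for page in doc:
            yield from block_texts(page.get_text("blocks"))
```

Keeping the tuple handling in its own function means it can be tried on hand-written tuples without a PDF or pymupdf installed.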

The first four numbers are the coordinates of the box where the text is to be drawn, and the last two are the block number and block type. These I simply ignored. The text, on the other hand, is the actual text from the documents. After a certain amount of experimentation I realized most of the PDFs contain one or more of the following headings in the boxes:

  1. SPECIAL REGULATIONS
  2. LEGAL GAME
  3. Means Restriction
  4. ENTIRE AREA CLOSED
  5. SPECIAL SEASON
  6. GENERAL USE & VISITATION INFORMATION
  7. YOUTH HUNTS
  8. E-POSTCARD SELECTION HUNTS

Since pymupdf presents the text in chunks that follow the logical English reading order, I just scan each block as it is output for these headings. When a heading is found, the text is split and segmented around it, which converts the stream of text into separate pieces of information. There is also a bunch of text visible on the map that isn't related to the headings. This unfortunately gets attached to a random heading in the document, but there isn't much I can do about that.
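The scan-and-split idea can be sketched as below. This is one possible reading of the approach, not the code from the script linked at the end of this post. It assumes headings match exactly as written in the list above, and it files everything under the most recently seen heading, which is also how stray map labels end up attached to an unrelated section.

```python
import re

HEADINGS = [
    "SPECIAL REGULATIONS",
    "LEGAL GAME",
    "Means Restriction",
    "ENTIRE AREA CLOSED",
    "SPECIAL SEASON",
    "GENERAL USE & VISITATION INFORMATION",
    "YOUTH HUNTS",
    "E-POSTCARD SELECTION HUNTS",
]
HEADING_RE = re.compile("|".join(re.escape(h) for h in HEADINGS))


def segment_blocks(texts):
    """Group block texts under the most recent heading seen.

    Returns a list of (heading, body) pairs; text that appears before any
    heading is filed under the empty string.
    """
    sections = [["", []]]
    for text in texts:
        position = 0
        for match in HEADING_RE.finditer(text):
            # Text before the heading still belongs to the previous section.
            sections[-1][1].append(text[position:match.start()])
            sections.append([match.group(0), []])
            position = match.end()
        sections[-1][1].append(text[position:])
    cleaned = []
    for heading, parts in sections:
        # Collapse whitespace and drop the ": " that follows a heading.
        body = " ".join(" ".join(parts).split()).lstrip(": ")
        if heading or body:
            cleaned.append((heading, body))
    return cleaned
```

Fed the blocks "Navarro County", a GENERAL USE & VISITATION INFORMATION box, and "Freestone County" in that order, the second county label lands in the body of the visitation section.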

Since I was doing this I quickly dumped the output into an HTML file and a CSV file. This gives access to all the different information boxes for each unit in a single document. I thought a while about the ideal way to present this but realized I don't really know at this point. What I decided to do was to present the data and make it traceable to the original PDF since that document is authoritative in all circumstances.
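Dumping rows to both formats needs nothing beyond the standard library. The sketch below is an illustration only: the column names, including source_pdf for the link back to the authoritative PDF, are hypothetical and may not match the layout of the files attached to this post.

```python
import csv
import html
import io


def to_csv(rows, fieldnames):
    """Render a list of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_html_table(rows, fieldnames):
    """Render the same rows as an HTML table, escaping text and linking the source PDF."""
    header = "".join(f"<th>{html.escape(name)}</th>" for name in fieldnames)
    lines = ["<table>", f"<tr>{header}</tr>"]
    for row in rows:
        cells = []
        for field in fieldnames:
            value = html.escape(str(row.get(field, "")))
            if field == "source_pdf":  # hypothetical column naming the authoritative PDF
                value = f'<a href="{value}">{value}</a>'
            cells.append(f"<td>{value}</td>")
        lines.append("<tr>" + "".join(cells) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

Carrying the PDF link in every row is what keeps each extracted box traceable to its source document.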

As I worked through this I noticed at least one unit is identified as "Property Boundary", which is obviously not correct. I also found entries like this

<tr>
<td>Comments</td>
<td>Added March 2020 from file Vanessa sent.</td>
</tr>

Apparently the name is just wrong in the top-level KML document but correct in the embedded HTML. I just extracted the name from there, since it is always correct. The comments aren't terribly useful but do provide some insight. I compiled all this into a Python script that outputs a CSV and an HTML file. Hopefully someone else finds this data useful as well.
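For anyone who wants to pull fields such as the name or the comments out of the embedded HTML themselves, here is a minimal sketch using html.parser from the standard library. It assumes the label/value rows look like the KML sample earlier in this post (two cells per row, nested inside an outer table) and that LocName is the field holding the reliable name. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class AttributeTableParser(HTMLParser):
    """Collect two-cell table rows (label, value) from the embedded HTML."""

    def __init__(self):
        super().__init__()
        self.attributes = {}
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            if self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            # Title rows have one cell and are skipped; only label/value pairs are kept.
            if self._row is not None and len(self._row) == 2:
                self.attributes[self._row[0]] = self._row[1]
            self._row = None


def parse_attributes(description_html):
    """Return the label -> value mapping found in one embedded HTML document."""
    parser = AttributeTableParser()
    parser.feed(description_html)
    parser.close()
    return parser.attributes
```

With this, parse_attributes(description).get("LocName") gives the name from the embedded table rather than the top-level KML name.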

 tpwd_hunting_2025_2026.csv 133.1 kB

The CSV file containing all data

 tpwd_hunting_2025_2026.html 162.5 kB

The HTML file with the regulatory data in a table format

 texas_public_hunt_parser.tar.gz 6.1 kB

Python source code I used to generate this


Copyright Eric Urban 2026, or the respective entity where indicated