I will be the first to say that you shouldn’t over-rely on tools like ChatGPT, particularly for programming. I could write a whole post about that. Today, I want to focus on where I think it excels in research and tinkering contexts: strange one-off programming tasks.
The task I had to complete: converting Unicode characters within multiple key-value mappings into their Unicode hex representations. This isn’t really a normal sort of task where it makes sense to optimize.
The Example
The problem involved input data like this. The aim was to be able to quickly type stylized Unicode text (monospace or bold letters, for example) with a quickswitcher. However, I realised that AutoHotkey couldn’t read the Unicode literals, and I had to convert them to their hexadecimal representation.
monospaceCharMap := {"0":"𝟶","1":"𝟷","a":"𝚊","b":"𝚋"}
boldCharMap := {"0":"𝟎","1":"𝟏","a":"𝐚","b":"𝐛"}
- Input: A text file containing multiple mappings.
monospaceCharMap := {"0":"𝟶","1":"𝟷","a":"𝚊","b":"𝚋"}
- Output: Replace the second value in each mapping (symbols) with their corresponding Unicode hex values.
monospaceCharMap := {"0":"U+1D7F6","1":"U+1D7F7","a":"U+1D68A","b":"U+1D68B"}
⚠️ It is important to note that my aim was specifically to have the output be literally what you see above, so I could copy and paste that back into my AutoHotkey script.
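As a minimal sketch of what that conversion means for a single character (the helper name `to_hex_label` is made up here for illustration, and the sketch assumes each value is exactly one Unicode character):

```python
def to_hex_label(char: str) -> str:
    """Return the 'U+XXXX' label for a single Unicode character.

    Hypothetical helper for illustration; not part of the final script.
    """
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    # ord() gives the code point; 04X pads to at least four uppercase hex digits
    return f"U+{ord(char):04X}"


# Escape sequences keep the example independent of font and editor support
monospace_a = "\U0001D68A"  # MATHEMATICAL MONOSPACE SMALL A
print(to_hex_label(monospace_a))  # U+1D68A
print(to_hex_label("A"))          # U+0041
```

Characters outside the Basic Multilingual Plane, like the styled letters here, come out with five hex digits, while ordinary ASCII characters are zero-padded to four.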
How ChatGPT Sorted It
In reality, I had 10 dictionary-like maps. It would have been a pain to manually do this. Even just thinking about the regex was giving me a headache for something that was meant to be for fun.
ChatGPT wrote up a Python script that:
- Parses multiple key-value mappings using regular expressions.
- Converts single-character values to their Unicode hex format using the ord() function and 04X formatting.
- Reconstructs the updated mappings and outputs them to a new file.
Here’s the complete script:
import re

# Read the input file
with open("text.txt", "r", encoding="utf-8") as file:
    content = file.read()

# Regular expressions to match each map and its key-value pairs
map_pattern = r'([a-zA-Z]+CharMap) := \{([^}]*)\}'
key_value_pattern = r'"([^"]+)":"([^"]+)"'

# Process each map
updated_maps = []
for map_match in re.finditer(map_pattern, content):
    map_name = map_match.group(1)
    key_value_string = map_match.group(2)

    # Convert single-character values to U+XXXX; leave longer values untouched
    updated_pairs = [
        f'"{key}":"U+{ord(value):04X}"' if len(value) == 1 else f'"{key}":"{value}"'
        for key, value in re.findall(key_value_pattern, key_value_string)
    ]

    # Reconstruct the map in the original AutoHotkey syntax
    updated_map = f'{map_name} := {{{",".join(updated_pairs)}}}'
    updated_maps.append(updated_map)

# Combine all updated maps into the final result, one map per line
updated_content = "\n".join(updated_maps)

# Save the updated content to a file
with open("output.txt", "w", encoding="utf-8") as outfile:
    outfile.write(updated_content)

print("Conversion complete! Output saved to 'output.txt'.")
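If you want to double-check the output before pasting it back, one option is to reverse the conversion and compare against the original input. This is only a sketch: the function name `labels_to_chars` is hypothetical, and it assumes every label is followed by a closing quote, as in the maps above.

```python
import re


def labels_to_chars(text: str) -> str:
    """Replace every U+XXXX label with the character it names.

    Hypothetical helper for checking the output; assumes each label
    is 4 to 6 uppercase hex digits followed by a non-hex character.
    """
    return re.sub(
        r"U\+([0-9A-F]{4,6})",
        lambda m: chr(int(m.group(1), 16)),  # hex string -> code point -> character
        text,
    )


converted = 'monospaceCharMap := {"a":"U+1D68A","b":"U+1D68B"}'
restored = labels_to_chars(converted)

# The restored text should match the original map with literal characters
print(restored == 'monospaceCharMap := {"a":"\U0001D68A","b":"\U0001D68B"}')  # True
```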
Code Breakdown
- Regex Matching:
  - map_pattern identifies each map and its key-value pairs.
  - key_value_pattern extracts the individual "key":"value" pairs.
- Unicode Conversion:
  - ord(value) gets the Unicode code point for a character.
  - f"{ord(value):04X}" formats the code point as uppercase hexadecimal, zero-padded to at least four digits (e.g., 1D7CE, which becomes U+1D7CE once the prefix is added).
  - if len(value) == 1 ensures that only symbols (single characters) are processed, not plain text values.
- Reconstruction:
  - The updated key-value pairs are joined using ",".join() to rebuild the map.
  - Each updated map is stored in a list and written to the output file.
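To see what the two patterns actually capture, here is a small sketch that runs them on a single sample line. The sample string is my own cut-down example, written with escape sequences so it doesn't depend on font support.

```python
import re

# Same patterns as in the script
map_pattern = r'([a-zA-Z]+CharMap) := \{([^}]*)\}'
key_value_pattern = r'"([^"]+)":"([^"]+)"'

# Sample input: bold digit zero (U+1D7CE) and bold small a (U+1D41A)
sample = 'boldCharMap := {"0":"\U0001D7CE","a":"\U0001D41A"}'

match = re.search(map_pattern, sample)
map_name, body = match.group(1), match.group(2)  # map name, text between the braces
pairs = re.findall(key_value_pattern, body)      # list of (key, value) tuples

print(map_name)  # boldCharMap
print([(key, f"U+{ord(value):04X}") for key, value in pairs])
# [('0', 'U+1D7CE'), ('a', 'U+1D41A')]
```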
What Features Made this a Good Task
- This was a strange one-off task where I would not be reusing the code. It doesn’t matter if the code is not optimal.
- There would have been little value in me spending the time to debug the regex myself.
- It is extremely easy for me to debug the code and modify it to meet my needs, so I’m not fighting against the LLM.
Conclusion
Overall, this was a pretty quick and clean solution to an annoying task. ChatGPT even helped prime a few sections of this blog (I take full credit that I’m sitting here writing in my voice though). When I have an extremely discrete regex task, ChatGPT is godlike.
For anybody interested, I will make a post describing the AutoHotkey script that converts keyboard input into specialized character maps such as monospace, bold italic, bold sans, cursive, double-struck, medieval, and italic. It supports both uppercase and lowercase characters by detecting the Shift key.