Prompt Engineering Essentials for 2024

Jeremiah Moore
5 min readFeb 5, 2024

--

image from Dalle-3 — bing.com/create

You will use AI more than ever this year. It may write your emails, help design a logo for you, or even appear in films and other media you consume. Communicating with Large Language Models (LLMs) such as ChatGPT will become a part of every day, which is why you need to know the basics of Prompt Engineering to effectively use AI technology.

Prompt engineering has emerged as a crucial skill, enabling users to communicate more effectively with AI to generate desired outputs, whether in text or image form. This article explores essential techniques in text-to-text and text-to-image prompt engineering, providing practical examples. Practice these techniques and you will be on your way to being a Prompt Engineer.

Text-to-Text Prompt Engineering Techniques

1. Clarify and Specify

Technique: The key to effective text-to-text prompt engineering is clarity and specificity in your questions or commands. Be as clear and detailed as possible.

Example: Instead of asking, “How do I cook rice?”, specify, “What is the step-by-step method to cook basmati rice for two people using a rice cooker?”

The type of rice, portion size, and cooking method are mentioned which will give a much better response from the LLM. Try using the ‘step-by-step’ method to prompt the LLM to output a sequence of steps or instructions.

2. Context

Technique: Providing contextual information helps LLMs understand the nuance of the request. Embedding contextual cues can significantly alter and improve the response quality.

Example: “Given that I’m a beginner in Python, how can I write a function to sort a list of numbers in ascending order?”

By mentioning your beginner status, it prompts the LLM to give you a basic solution in regards to your skill level.

3. Prompt Chaining

Technique: Sequentially building upon previous responses, or prompt chaining, can lead to more complex and comprehensive answers.

Example: First ask, “What are the main causes of climate change?” Follow up with, “Based on these causes, what are effective mitigation strategies?” Most LLMs have a context window, meaning it will recall previous parts of the conversation when asked. This gives LLMs like ChatGPT a conversational feel.

You: { initial question }
LLM: { broad strategies answer }

You: "Based on { broad strategies answer } what ... ?"
LLM: ... { refined answer }

4. Iterative Refinement

Technique: Iterative refinement involves progressively tweaking and refining your prompt based on the responses received. This approach allows for a more dynamic interaction with the model, where each response is used to adjust the subsequent prompt for clarity, depth, or specificity. It’s particularly useful for exploratory topics or when seeking to refine ideas or concepts.

Example: If your initial question is “What are innovative ways to reduce carbon footprint?”, and the response covers broad strategies, you can refine the prompt iteratively like, “Of the strategies mentioned, which can be implemented at a community level with low cost?” Based on the model’s answer, you can further refine, “Please provide a step-by-step plan for implementing solar panel initiatives in a small community.” Plan out your questions and commands this way in advance, and then use iteration to refine the output. You will find that this method can give much better results than trying to prompt too many instructions at once, which can confuse the LLM.

You: { initial question }
LLM: ... { response covering broad strategies }

You: { refining question based on LLM response }
LLM: ... { refined response }

You: { refining question based on LLM response }
LLM: ... { further refined response }

(iteration)

Text-to-Image Prompt Engineering Techniques

1. Descriptive Precision

Technique: When generating images, the precision of your description directly influences the output. Detailed descriptions of the scene, subjects, colors, and mood lead to more accurate and satisfying results.

Example: “Serene mountain landscape at sunset, featuring a clear lake in the foreground, surrounded by tall pine trees, with a backdrop of snow-capped mountains under a gradient orange sky.”

Generate an image of a serene mountain landscape at sunset, featuring a clear lake in the foreground, surrounded by tall pine trees, with a backdrop of snow-capped mountains under a gradient orange sky.
image generated by Dall-e 3

2. Style and Artistic Influence

Technique: Specifying an artistic style or influence can guide the AI in generating images that reflect a particular aesthetic or era.

Example: “Create an image in the style of Impressionism depicting a busy Paris street scene in the early 1900s, focusing on lively street cafes and pedestrians in period attire.”

Create an image in the style of Impressionism depicting a busy Paris street scene in the early 1900s, focusing on lively street cafes and pedestrians in period attire.
image generated by Dall-e 3

3. Composition and Perspective

Technique: Directing the AI on composition and perspective can dramatically affect the outcome, enabling the creation of more dynamic and engaging images.

Example: “Design an image from a bird’s-eye view showing a medieval castle surrounded by a dense forest, with a winding river to the east.”

Design an image from a bird’s-eye view showing a medieval castle surrounded by a dense forest, with a winding river to the east.
image generated by Dall-e 3

4. Emotion and Atmosphere

Technique: Incorporating emotional cues or atmospheric elements into your prompt can elicit images that evoke a specific feeling or mood.

Example: “A cozy, dimly lit library with a fireplace, filled with shelves of old books, a comfortable leather armchair, and a sleeping cat, conveying a sense of warmth and tranquility.”

Generate an image of a cozy, dimly lit library with a fireplace, filled with shelves of old books, a comfortable leather armchair, and a sleeping cat, conveying a sense of warmth and tranquility.
image generated by Dall-e 3

5. Cross-modal References

Technique: Cross-modal reference involves using descriptions from one sensory modality to influence the creation in another, such as incorporating sound or tactile sensations into visual prompts. This technique can encourage the AI to generate images that not only capture the visual aspect but also convey a sense of the other senses, creating a more immersive and richly detailed image.

Example: “Bustling city street that visually echoes the cacophony of sounds typical of a rainy day, with people bustling under colorful umbrellas, reflective wet pavements, and the grey, overcast sky, making one almost ‘hear’ the raindrops and the murmur of the crowd.” This prompt asks the AI to translate the auditory experience of a rainy day into a visual depiction that evokes the same atmosphere and mood.

Bustling city street that visually echoes the cacophony of sounds typical of a rainy day, with people bustling under colorful umbrellas, reflective wet pavements, and the grey, overcast sky, making one almost ‘hear’ the raindrops and the murmur of the crowd.
image generated by Dall-e 3

Conclusion

As we navigate through 2024, the art of prompt engineering for both text-to-text and text-to-image applications in LLMs continues to evolve. Mastering these techniques not only enhances our ability to interact with AI but also opens up new avenues for creativity, research, and problem-solving. By applying these strategies, users can leverage the full potential of LLMs, transforming vague ideas into precise outcomes and vivid visualizations. As the field progresses, staying updated with the latest practices and experimenting with prompts will be key to unlocking the myriad possibilities that AI offers.

--

--

Jeremiah Moore

Developer, Creator, Artist, and Human Being living in the United States