Midjourney guide: how to use it, explained

Everything you wanted to know about MidJourney

What is Midjourney, and how does it compare to DALL·E? Learn all about the latest text-to-image AI that creates stunning results.

Summary: MidJourney is a text-to-image AI like DALL·E, with a specialism in ‘pretty’ images. You use it by messaging a bot in the chat app, Discord – it’s not a web application, but it doesn’t require any programming either. It’s free to try, or costs $10/month for a basic plan after that. It’s really good!

What is MidJourney?

Just like DALL·E 2 or Craiyon (formerly DALL·E Mini), MidJourney is a text-to-image AI that generates gorgeous visuals based on your text prompts.

While DALL·E is designed to generate anything you can imagine – including the mundane or ugly – Midjourney has a bias towards creating painterly, aesthetically-pleasing images by default. Given the choice, Midjourney prefers to create images with complementary colours, artistic use of light and shadow, sharp details, and composition with satisfying symmetry or perspective.

In the words of its founder, “we just want it to be easy to use – and we want the pictures to look good.”

So, let’s look at how that works in practice!

How do you use MidJourney?

Unlike DALL·E 2 or DALL·E Mini, Midjourney doesn’t work as a web app. Nor is there any coding required.

Instead, you use it in a popular chat app called Discord, by ‘talking to’ a bot. (Discord, for those unfamiliar, is very similar to Slack, and just like Slack, it works both in your browser and in a standalone app, on both desktop and mobile.)

For free/trial users, the experience can be especially chaotic, because you interact with the bot in a busy public chatroom where everyone else is doing the same thing!

A timelapse of users prompting the bot in the public channel.

However, it’s also a pretty fun way to get started: you can see the exact prompts everyone else is trying – and their results – live and unfiltered!

Once you become a paid user (from just $10 a month) you’ll be able to DM the bot in a private conversation, making for a far calmer experience. (You can still keep an eye on the public channels to see what’s going on, or participate in challenges, if you’re feeling nosy.)

As you’d expect, generating images consists of typing in a sentence and seeing what happens.

Step one: prompting Midjourney

Generating images takes more than twice as long as with DALL·E: it takes about 50 seconds to reveal your initial thumbnails.

However, unlike DALL·E, you can watch as the AI gradually generates your images, from initial blurry colours, to high-definition thumbnails. It’s rather hypnotic, and only increases the sense of delicious anticipation.

Unlike DALL·E, there’s then a second step. The initial thumbnails are only 256px (though you can download all four as a grid!), so you then upscale your favourite to full quality, which costs another credit.

This upscaling process takes an additional minute or so to complete – you’ll also be able to watch it happen as Midjourney fills in the details.

Just like DALL·E, you can also create ‘variations’ of any MJ image where you’d like to see some similar outputs. (Note, this is only possible using the UI beneath a previous generation – you can’t upload your own image to vary.)

Requesting a variation generates four options, rather than DALL·E’s three.

Interestingly, variations are less computationally intensive than generating an image from text – on a $10 Basic Plan, for example, you could generate 666 variations (for 2664 total thumbnails), but only 200 text prompts.
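
The figures above imply a rough exchange rate between variations and text prompts. Here is a minimal sketch of that arithmetic, assuming the whole Basic Plan allowance can be spent on one kind of job. The constant and function names are illustrative only, not Midjourney terminology.

```python
# Figures quoted in this article for the $10 Basic Plan.
TEXT_PROMPTS_PER_PLAN = 200
VARIATIONS_PER_PLAN = 666
IMAGES_PER_VARIATION = 4  # each variation request returns four options


def total_variation_thumbnails() -> int:
    """Thumbnails produced if the whole plan is spent on variations."""
    return VARIATIONS_PER_PLAN * IMAGES_PER_VARIATION


def variation_cost_in_prompts() -> float:
    """Implied cost of one variation, measured in text prompts."""
    return TEXT_PROMPTS_PER_PLAN / VARIATIONS_PER_PLAN


print(total_variation_thumbnails())
print(round(variation_cost_in_prompts(), 2))
```

On these numbers, one variation costs roughly a third of a text prompt.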

What kind of images does MidJourney create?

Here’s the most important thing to know – if you’ve heard of Midjourney, but aren’t that familiar with it, it might be a lot more capable than you’re aware of. Previously, Midjourney has been heavily associated with a certain kind of image – highly ornate, almost ‘fractal’ pictures.

Although they’re certainly very striking, they share a certain alien style that gives them that signature ‘AI generated look.’ But Midjourney’s model is now on ‘version 3’ – and it’s more powerful and adaptable than ever.

Not photos: but photorealistic

Midjourney avoids generating straight-up ‘everyday’ photography, but it can create beautiful images that are 90% or more of the way there: like a very detailed piece of art, a convincing piece of CGI, or perhaps a real photograph that’s been heavily Photoshopped to create an artistic effect.

All images generated by Midjourney, sourced from the community gallery: phishnchips, richardhendricks, RankSquid, urens, Danger

In particular, Midjourney’s tendency to generate ultra-sharp images adds to the overall feeling of veracity.

Prompt-wise, Midjourney understands all the technical photography jargon you’d expect from using DALL·E (or consulting the Prompt Book), happily recreating the implied vibe you’re after, whether that’s a particular lens, film stock or lighting setup.

Creating 2D art, painting and illustrations with MidJourney

Just like DALL·E, MidJourney is more than capable of taking inspiration from a wide variety of different art mediums, styles, and historic artists. From ornate ‘paintings’ to blocky, abstract illustrations, analogue sketches to digital

Mocking up 3D visual art with MidJourney

Finally, like DALL·E, MidJourney can be used to develop convincing concept art for 3D artwork. The aforementioned ultra-sharp photography style really shines at any scale, whether it’s monumental interiors or intricate details.

Midjourney vs DALL·E: visual strengths and weaknesses

Head-to-head with DALL·E, MidJourney is often more aesthetically pleasing. MidJourney is essentially built to be ‘pretty by default’, so even for vaguely defined prompts, it delivers much more reliably ‘aesthetic’ images. Here’s ‘girl discovers meaning of life’ for example:

The drawback is, even given more specific prompts, Midjourney has a tendency to ignore the requested style in favour of something ‘better looking.’

Challenged to illustrate ‘anxious thoughts’ in an ‘outsider art’ style, we can see that DALL·E’s result is much closer to the scribbly aesthetic we’d stereotypically expect, while Midjourney prefers to deliver something more composed and less ‘outsider-y.’

Similarly, asked to generate a ‘pixel art pineapple’, Midjourney can’t resist adding curves, details or anthropomorphic features, while DALL·E follows instructions to the letter:

For the casual user, this bias can actually be an advantage. Midjourney is a fantastic option for rapidly generating a coherent set of imagery, like stock illustrations to accompany a series of articles, as long as you’re happy for it to call the aesthetic shots.

Let’s imagine we’re working on a project dealing with ‘mental health in the workplace.’ From a single prompt, MidJourney’s generations are all very consistent, so on our homepage, they’d go together really well!

‘A business man thinking business thoughts, Futurism’ – generated by Midjourney

Even across different prompts, there are enduring aesthetic similarities: the results wouldn’t be miles apart stylistically.

Now compare the above images with DALL·E’s responses to the same prompts: they are much more wide-ranging, as if we’d asked different designers to create each image, or they were taken from different magazines.

With more styles to choose from, that means we’re more likely to find what we’re looking for – but they don’t go particularly well together. And if we ran a similar prompt again, we might see something very different again!

Alas, not only does Midjourney’s model make your results consistent, it’s making everyone’s results consistent. Below, we can see an image I recently generated and shared on Twitter. I was struck by the creative response to the prompt ‘last thoughts of a dying man’ – until two other people in the community shared recent images they’d made…

Clearly, the model has some crowd-pleasing techniques up its sleeve, ready to satisfy a whole range of requests. (Is… the model… hacking us?)

As MidJourney images are used more widely, these aesthetics might start to feel ‘played out.’ But on the other hand, Midjourney might develop new tricks before that happens. Or perhaps we’ll always find a figure in front of a portal irresistible…

In short, the challenge with Midjourney is not in creating a beautiful image – that’s remarkably easy – but in pushing the model to create a distinct image with a novel style.

Content rules and limitations

While still precluding adult imagery of gory violence and sexual content, Midjourney has far fewer content limitations than DALL·E. There are no rules against creating images that depict ‘violence’ broadly or fantastically, illness and disease, political content, or depictions of public figures.

Here are some examples of images that would be blocked by DALL·E:


‘Gun’ (violence), ‘coronavirus’ (illness + health), ‘Trump’ (politics) and ‘Taylor Swift’ (public figure) are all against the DALL·E content policy.

So, if you’re trying to design sci-fi super-soldiers, create illustrative work to accompany health content, design punchy political images, or just generate some good old-fashioned fan art, you’ll be wanting to choose MidJourney every time.

The examples above also illustrate just how censorious DALL·E’s restrictions currently are. (In OpenAI’s defence, because Midjourney doesn’t generate fully photorealistic images, there is a reduced risk of misuse through fakery and misinformation.)

Everything is saved – and everything is public

DALL·E only makes your 50 most recent generations recoverable, unless you manually save individual images, which are then stored in a big, unsearchable bucket. (Something of a pet peeve!)

But MidJourney saves every thumbnail and HD upscale by default, in your own personal archive, which you can also search by prompt. That makes it impossible to lose any work, and easy to find it again. Much better!

View of one’s personal archive.

One potential downside for the privacy-minded: by default, every prompt and generation is public, shared in a mighty gallery. (Currently, only paid members can browse this content.) And of course, if you’re a free/trial user, you’ll be posting your prompts directly in a public chatroom. If you want to opt out – perhaps you’re working on a commercial project – it’ll cost you $50/month.

Exploring prompts with ‘strawberry’ in them.

However, the benefits of being able to consult this huge archive are considerable: it’s a brilliant way to investigate what styles exist, and stumble across new ideas constantly. For instance, you could search for techniques (like ‘pastels’), artists (like ‘da Vinci’), or subjects (like ‘sneakers’). You can also bookmark others’ work for later reference, and of course, time spent exploring the archive of prior creations means you’ll burn through your job credits a little slower.

Other MidJourney-only techniques

MidJourney offers a few clever features that DALL·E doesn’t.

Creating landscape and portrait images

This is super-straightforward: just add a parameter such as --ar 2:1 to the end of the prompt to create an image that’s twice as wide as it is tall. For example:

Dark oil painting of a horse king --ar 2:1

Some common aspect ratios are:

16:9 (widescreen desktop, phone in landscape)
9:16 (vertical phone, Instagram story)
4:3 (typical ‘thumbnail’ or landscape photo)
4:5 (the most ‘portrait’ an image can be in the Instagram feed)

Landscape images are great for panoramic views, of course, but also for anything ‘cinematic’, including action scenes! Portrait compositions work well for posters, full-body portraits and vertiginous subjects like skyscrapers.
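
As a sketch of how the `--ar` suffix composes with a prompt, the hypothetical helpers below append the flag and report which orientation a ratio produces. They are not part of Midjourney: `with_aspect_ratio` and `orientation` are names invented here. They only format the text you would type to the bot and do not check whether Midjourney accepts a given ratio.

```python
def with_aspect_ratio(prompt: str, width: int, height: int) -> str:
    """Append an --ar flag (width:height) to a text prompt."""
    if width <= 0 or height <= 0:
        raise ValueError("aspect ratio parts must be positive integers")
    return f"{prompt.strip()} --ar {width}:{height}"


def orientation(width: int, height: int) -> str:
    """Classify a ratio as landscape, portrait, or square."""
    if width > height:
        return "landscape"
    if width < height:
        return "portrait"
    return "square"


# The ratios listed above, paired with the example prompt from this section.
for w, h in [(16, 9), (9, 16), (4, 3), (4, 5)]:
    print(orientation(w, h), "->", with_aspect_ratio("Dark oil painting of a horse king", w, h))
```

Keeping the ratio as two integers, rather than a pre-formatted string, makes it easy to reuse the same prompt across several output shapes.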

Panoramic landscape (by spacef)
Cinematic widescreen still (by Jas)

Adding ‘image prompts’

Just start the prompt with the URL of an image, and Midjourney will attempt to use ‘the style’ of that image to influence the output. (Note this is not the same as varying it or ‘editing it’ directly – it will not take the subject of the image and apply your prompt to it.)

For example, if we start with the image on the left, with the word ‘dog’ attached, we get the following:

In my experience, this is pretty difficult to control! It seems easier to just prompt with text. That said, with perseverance you might get some interesting effects, and it would be interesting to document the results – there is not much published study of this feature.

Light upscaling

This is easier to visualise than explain. Basically, the default ‘regular upscaling’ adds detail to the HD version, so a painting of a field of flowers will contain more blades of grass, petals and so on.

Conversely, ‘light upscaling’ keeps the same number of brushstrokes and simply makes the existing detail bigger.

Left: light upscaling. Right: regular upscaling.

Why Midjourney is better than DALL·E

Appearance: Midjourney’s images are usually more aesthetically pleasing than DALL·E’s, and the model is still adaptable and responsive to stylistic prompts. That makes it a great tool if you want to generate a lot of pleasant images quickly – like your own archive of stylised ‘stock illustrations’ to accompany articles, videos or intangible services – without needing to fine-tune the prompt too closely.

Ease of use: With your choice of aspect ratio, you can transform an idea into an Instagram story, YouTube thumbnail or book cover much faster. And the public gallery, with hundreds of thousands of images to be inspired by, makes it easy to check if your prompt will be understood before you type it in, and to find thousands of new ideas.

Managing your work: With automatic archiving, it’s impossible to lose the images you’ve paid to make. It also saves all your thumbnails in a 2×2 grid, so you don’t need to worry about saving every image individually.

Mobile use: The Discord app works much better on mobile than the DALL·E website, making it easier to generate on-the-go (or in bed, for that crazy 3am prompt idea you just dreamed about.)

Visible generation process: The animated generation process is not just fun to watch, it also makes Midjourney better for workshops or presentations where you’d otherwise experience ‘dead air’ as DALL·E’s grey progress bar trundles across the screen.

Unlimited use: For $30/month, MidJourney offers an all-you-can-eat mode, although eventually your speed of generation will be slowed down.

No watermark: unlike DALL·E, the bottom 20px of each image is unadorned by a ‘signature.’

Cheaper, if you want one ‘hero’ output per prompt: If you’re likely to want only one (‘the best’) output per prompt, then Midjourney gives you 100 prompts (and 100 upscales) for $10, or 10¢ per image. (DALL·E gives you 115 prompts for $15, which is 13¢ per image.)

Why DALL·E is better than Midjourney

Flexibility: Trained on (one assumes) a much larger range of images, DALL·E is capable of delivering a wider range of visual styles.

Uniqueness: As a result, you’re much more likely to craft a surprising or amusing result, or create a never-before-seen image. It’s less likely the image will seem ‘AI-like’. You can also feel more confident that other users haven’t generated very similar images.

Responsiveness: Because DALL·E’s model is less opinionated, it’s more responsive to style prompts, especially if that style is less immediately beautiful – it won’t try to overrule your opinion. Therefore, you’re more likely to get a precise reaction to a specific request, like pixel art.

Raw speed: From initial prompt to full HD download takes as little as 20 seconds, vs Midjourney’s 120.

Photography: DALL·E is also more adept at creating realistic, ‘normal’ photographs that wouldn’t be out of place in a magazine or corporate website.

Image editing: DALL·E also offers powerful tools that Midjourney doesn’t: in-painting, uncropping, and varying image uploads are crucial to the more inventive uses of AI art we’re currently seeing.

A real web app: With its own minimal web interface, you can work directly with DALL·E – the confusion of installing Discord is likely to be off-putting to some kinds of users.

Privacy: With all your generations private by default, it’s better for those working on personal or sensitive commercial projects.

Sharing URLs: DALL·E lets you turn a creation into a shareable URL, so it’s easy to send a link.

Cheaper if you want bulk images: if you might hypothetically need multiple images from the same prompt (e.g. all four photos of ‘a lemur on a lemon’ for an upcoming… OK, I have no idea) then $15 buys you 115 prompts x 4 generations = 460 HD images, for 3.3¢/image. In comparison, Midjourney’s $10 basic plan gives you roughly 200 ‘jobs’, but prompting and upscaling each cost 1 job – the maximally efficient ratio would be 40 prompts and 160 upscales, or 6.25¢/image.
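
The per-image prices quoted in the two lists above can be reproduced with a few lines of arithmetic. This is a rough sketch using only the figures in this article. It assumes a Midjourney prompt and each upscale cost exactly one job apiece, and the function names are made up for illustration.

```python
def cost_per_image_cents(price_usd: float, images: int) -> float:
    """Price of one finished image, in cents."""
    return price_usd * 100 / images


def midjourney_images(jobs: int, upscales_per_prompt: int) -> int:
    """Full-res images from a job budget: 1 job per prompt + 1 per upscale."""
    prompts = jobs // (1 + upscales_per_prompt)
    return prompts * upscales_per_prompt


# DALL·E: $15 buys 115 prompts, each returning 4 full-res images.
dalle_hero = cost_per_image_cents(15, 115)      # keep one image per prompt
dalle_bulk = cost_per_image_cents(15, 115 * 4)  # keep all four

# Midjourney Basic: $10 for roughly 200 jobs.
mj_hero = cost_per_image_cents(10, midjourney_images(200, 1))  # 100 images
mj_bulk = cost_per_image_cents(10, midjourney_images(200, 4))  # 160 images

print(f"hero: DALL·E {dalle_hero:.1f}¢ vs Midjourney {mj_hero:.1f}¢")
print(f"bulk: DALL·E {dalle_bulk:.1f}¢ vs Midjourney {mj_bulk:.2f}¢")
```

The crossover comes from the upscale step: every extra image you keep costs Midjourney another job, while DALL·E’s four images arrive at full resolution for the price of one prompt.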

Why not both?

One powerful technique is to create original images in Midjourney, then import them to DALL·E for variation and in-painting. And there are other great image-editing tools that work well with both.

DALL·E vs Midjourney, head-to-head

Trait | DALL·E | Midjourney
Works as a… | Web app | Chatbot in Discord
Banned topics | Adult imagery (gory violence, sexual imagery), any & all violence, public figures and celebrities, health or illness-related themes, political topics | Adult imagery (gory violence, sexual imagery)
Outputs as | 4 x 1024×1024 PNGs per prompt | 4 x 256px thumbnails per prompt, then upscale to any aspect ratio (1024sq is default)
Variations | ✅ x 3, from any prompt image (except photo-like human faces) | ✅ x 4, but only from previous generations (can’t upload)
Edits, inpainting, uncropping | ✅ Yes | ❌ No
Time-to-thumbnail | 🚀 20 sec | 🚘 53 sec
Time-to-full-res-download | 🚀 20 sec (as above – all thumbnails are already full-res) | 🚘 2m 38s
Landscape & portrait images | ❌ No, only with stitching | ✅ Yes
Photorealistic images | ✅ Yes | 🤷🏻‍♂️ Nearly!
‘Image + text’ combination prompts | ✅ Yes, with inpainting | ✅ Sort of: as seed images
Personal archive | ⚠️ Limited: Last 50 generations (200 images) – or manually save individual images to collection (up to 10,000) | Total: Every thumbnail and HD generation you create is stored and searchable
Public archive | ❌ No public archive, DALL·E outputs are not searchable | ✅ Yes: paid users can explore generations and prompts from other users
Privacy | 🙈 All generations + prompts are private by default (OpenAI can inspect prompts for safety + moderation), manual share-to-web option | 🔦 All generations + prompts are public (to other paid users) by default (unless you get a private/corporate account for +$50/month)
Cost | Credits-based: $15 USD for 115 prompts (aka 460 HD images) (3¢/image), whenever | Subscription-based, with limits: $10/month (limit of approx. 200 jobs, either prompting or upscaling), $30 (unlimited but slows do

Ready to try it? Visit midjourney.com to get started.
