The thing with 3D is that shapes get most of their realism from
lighting. You might consider replacing sphere(1) with a call to box(1), and then learning a bit about lighting in Processing.
We also understand a 3D shape by watching how it moves in space over time. With 3D scenes, the easiest way to do this is to add a rotateX/Y/Z call at the top of draw() whose angle depends on the current time. Unfortunately, the Mandelbulb is very computationally expensive, meaning you'll get <1 frame per second without optimizations. That is far too slow for our brains to read the rotation as continuous motion and recognize the shape being drawn.
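As a rough illustration of the lighting and rotation advice, here is a minimal Processing sketch (Java mode, meant to be pasted into the Processing editor) that draws a single lit box and turns it a little every frame. The sizes, the tilt and the 0.02 rotation step are arbitrary choices for the demo, and the single box only stands in for the real scene.

```java
// Processing sketch (Java mode): one lit box that turns as frames advance.
// All numeric values here are arbitrary demo choices.
void setup() {
  size(640, 480, P3D);  // the P3D renderer is needed for lights and 3D transforms
  noStroke();           // hide edge lines so the shading is easier to see
}

void draw() {
  background(0);
  lights();                             // Processing's default lighting setup
  translate(width / 2, height / 2, 0);  // move the origin to the window centre
  rotateY(frameCount * 0.02);           // angle grows each frame, so the shape keeps turning
  rotateX(0.4);                         // fixed tilt so the top face is visible too
  scale(100);                           // box(1) is one unit wide; enlarge it
  fill(200);
  box(1);
}
```

Using frameCount ties the angle to the number of frames drawn; millis() would tie it to wall-clock time instead. A single box like this runs smoothly. It is drawing the full Mandelbulb point set in place of that one box that pushes the loop below one frame per second.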
You can get around this in a couple of ways:
a) Render each frame to an image file and then stitch the frames together into a movie. Processing's saveFrame() can write the numbered images for you. This might be the most straightforward way, but it could take a LOT of time.
b) Do some optimization. For instance, if you can calculate the 3D point's position on your screen and see that another box has already been drawn "in front" of it, you don't have to run the Mandelbulb iteration for that point at all, since it'll be hidden behind the other box regardless.
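One way option (b) might look in code is a small per-pixel depth record, sketched below. DepthMask and its methods are hypothetical names made up for this example, not part of Processing. It assumes you already have each candidate point's screen position and depth (in a sketch, screenX(), screenY() and screenZ() could supply them), and it only checks the point's centre pixel, so it is an approximation.

```java
// Hypothetical helper (not a Processing API): remembers the nearest depth
// already drawn at each screen pixel so hidden candidate points can be skipped.
class DepthMask {
  final int w, h;
  final float[] nearest;  // smallest depth drawn so far, one entry per pixel

  DepthMask(int w, int h) {
    this.w = w;
    this.h = h;
    nearest = new float[w * h];
    clear();
  }

  // Call at the start of every frame: nothing has been drawn yet.
  void clear() {
    for (int i = 0; i < nearest.length; i++) {
      nearest[i] = Float.POSITIVE_INFINITY;
    }
  }

  // True if a box closer to the camera already covers pixel (sx, sy).
  // Off-screen positions return false, so nothing is skipped by mistake.
  boolean isHidden(int sx, int sy, float depth) {
    if (sx < 0 || sy < 0 || sx >= w || sy >= h) return false;
    return nearest[sy * w + sx] < depth;
  }

  // Record a drawn box as a square of (2 * half + 1) pixels per side
  // centred on (sx, sy), clipped to the screen.
  void markDrawn(int sx, int sy, int half, float depth) {
    int x0 = Math.max(0, sx - half), x1 = Math.min(w - 1, sx + half);
    int y0 = Math.max(0, sy - half), y1 = Math.min(h - 1, sy + half);
    for (int y = y0; y <= y1; y++) {
      for (int x = x0; x <= x1; x++) {
        int i = y * w + x;
        if (depth < nearest[i]) nearest[i] = depth;
      }
    }
  }
}
```

For each candidate point you would call isHidden() first and only run the expensive iteration when it returns false, then call markDrawn() for every box you actually draw. The check is safe in any order, but it saves the most work if points are visited roughly front to back, since near boxes then get recorded before the far points behind them are tested.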