openprocessing.org changes filenames on upload, e.g. drops + and changes multiple . to _

edited January 2018 in p5.js

Hi, not sure best category for this post, maybe there should be a new openprocessing.org category.

I've noticed that when uploading image files to openprocessing.org, in p5.js mode, the filenames get changed. Any + (plus) signs in the filename get deleted, so image+extra.jpg becomes imageextra.jpg. Also, when a name has multiple . (dot) characters, the extra ones become _ (underscore), thus image.icon.jpg becomes image_icon.jpg. There's some logic there, in that multiple dotted components in a filename can be troublesome, and + signs in filenames are also asking for trouble a little bit, but it's unexpected. Anyone have any thoughts on that? (I've solved my problem, I just accepted what the site does.) I could ask Sinan via my Plus membership, but I thought I'd cast wider on this one, in case it's a known type of filename restriction policy. (Well, if no-one asks the trivia, it just rolls on forever, right?)
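For anyone who wants to predict the stored name before uploading, here is a minimal sketch that mimics the two changes described above. The function name `mimicUploadRename` is made up for illustration, and the rule "keep only the last dot" is a guess based on the two examples, not OpenProcessing's actual code.

```javascript
// Hypothetical helper: reproduces the renaming observed in this post.
// This is a guess at the behaviour, not OpenProcessing's real implementation.
function mimicUploadRename(filename) {
  // Drop every plus sign.
  const noPlus = filename.replace(/\+/g, "");

  // Assume the final dot separates the extension and should be kept.
  const lastDot = noPlus.lastIndexOf(".");
  if (lastDot <= 0) {
    // No extension, or a name that starts with its only dot: leave as is.
    return noPlus;
  }

  // Turn every earlier dot into an underscore.
  const stem = noPlus.slice(0, lastDot).replace(/\./g, "_");
  return stem + noPlus.slice(lastDot);
}

console.log(mimicUploadRename("image+extra.jpg")); // imageextra.jpg
console.log(mimicUploadRename("image.icon.jpg"));  // image_icon.jpg
```

Referring to files in a sketch by the already-renamed form avoids a mismatch between the local filename and the uploaded one.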

Cheers, GE.

Answers

  • This could be generic to uploading files to a web platform -- that is, openprocessing.org might not have chosen these substitutions directly, it may be using a standard library that follows security and access best-practices.

    .: OWASP has a page on file uploads that specifically discusses the ways that bad things can slip through if you let users use multiple . characters in filenames:

    +: this indicates a blank space in URL encoding, so names with plus could be tricky to round-trip through URLs without special handling. If you are uploading files that are meant to be used over the internet, you could URL-encode the filename, but now you are letting people write % into names in filespace -- simpler to just get rid of it. There is an example discussion of dealing with such issues here:
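    To illustrate the + issue, here is a small sketch using the standard `URLSearchParams` and `encodeURIComponent` functions. The query key `file` is only an example name. Form-style query parsing reads + as a space, so the plus only survives if it is percent-encoded first.

    ```javascript
    // Form-style query parsing treats "+" as an encoded space.
    const naive = new URLSearchParams("file=image+extra.jpg").get("file");
    console.log(naive); // image extra.jpg

    // Percent-encoding the name first turns "+" into "%2B",
    // which decodes back to a literal plus sign.
    const encoded = encodeURIComponent("image+extra.jpg");
    console.log(encoded); // image%2Bextra.jpg

    const safe = new URLSearchParams("file=" + encoded).get("file");
    console.log(safe); // image+extra.jpg
    ```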

  • Many thanks Jeremy. That's a great bunch of info. It's bringing back forgotten issues in URLs for me, like the + flagging a space. I think I should gratefully accept what OpenProcessing is doing rather than squawk about it. I can live with [a-zA-Z0-9]{1,200}\.[a-zA-Z0-9]{1,10} as OWASP indicates. Maybe also with underscores as separators. Who would have thought 30 years ago we'd have to be so fanatically careful. Cheers, Greg E.
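    A minimal sketch of checking names against that whitelist, written with the dot escaped so it matches a literal "." and with anchors added so the whole name must match. The helper `isSafeName` and the underscore variant are assumptions for illustration, not something OWASP or OpenProcessing specifies.

    ```javascript
    // Whitelist pattern quoted in the post: alphanumeric stem, one dot,
    // alphanumeric extension. The ^ and $ anchors are added here.
    const SAFE_NAME = /^[a-zA-Z0-9]{1,200}\.[a-zA-Z0-9]{1,10}$/;

    // Hypothetical variant that also allows underscores as separators in the stem.
    const SAFE_NAME_UNDERSCORE = /^[a-zA-Z0-9_]{1,200}\.[a-zA-Z0-9]{1,10}$/;

    // Hypothetical helper: true when the whole filename matches the pattern.
    function isSafeName(name, pattern = SAFE_NAME) {
      return pattern.test(name);
    }

    console.log(isSafeName("imageextra.jpg"));                        // true
    console.log(isSafeName("image+extra.jpg"));                       // false
    console.log(isSafeName("image.icon.jpg"));                        // false
    console.log(isSafeName("image_icon.jpg"));                        // false
    console.log(isSafeName("image_icon.jpg", SAFE_NAME_UNDERSCORE));  // true
    ```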

  • edited January 2018

    @grege2 -- Glad that was helpful!

    I think you are right. I personally wasn't doing anything with markup or CGI until ~'94, but I think whenever things get really complex (in almost any area of computing), being fastidious about your inputs becomes a virtue.

    Right now I'm working on a Processing.py pre-processor that runs, passes its output through Markdown, which converts it to HTML, which is modified by jQuery+SASS in the browser. Throwing a "#" or "//" or "/*" into the pipeline and trying to get it through the entire chain without very weird things happening is... challenging.
