Hi, I found some code in the old forum which I'm trying to modify to pull an image URL from a webpage. The original code matches an image tag of the form
<IMG SRC="image/1207/AR1520_071112friedman900.jpg"
Using
// Optional spaces, the IMG tag and its attribute, capture of the URL, anything after it
Pattern pat = Pattern.compile("\\s*<IMG SRC=\"(image/.*?)\".*");
However I want to match/extract a tag in the form of
<a class="link" href="GOPR0237.JPG">GOPR0237.JPG</a>
(where the 4 digits between "GOPR" and ".JPG" are unknown). Ultimately it will be compiled into the URL
"http://10.5.5.9:8080/DCIM/100GOPRO/GOPRxxxx.JPG" so the image can be downloaded.
This is the current full code, if anyone's interested:
import processing.net.*;
import java.util.regex.*;
import java.util.*;
import java.text.*;
Client c;
String URL_BASE = "http://10.5.5.9:8080/DCIM/100GOPRO/"; // Where the GoPro stores its images
PImage GoProImage;
Pattern pat = Pattern.compile("\\s*href=\"(GOPR.*?)\".*"); //NOT CURRENTLY WORKING!
void setup()
{
  size(4000, 3000);
  background(50);
  fill(200);
  DateFormat df = new SimpleDateFormat("yyyy-MM-dd");
  String imageName = "GoPro-" + df.format(new Date()) + ".jpg";
  c=new Client(this, "10.5.5.9", 80);
  c.write("GET /bacpac/SH?t=oakh6214&p=%01 HTTP/1.0\r\n");
  c.write("\r\n");
  String url = findImageURL(URL_BASE);
  print(url);
  if (url != null) // Found
  {
    GoProImage = loadCachedImage(imageName, URL_BASE + url);
  }
}
void draw()
{
  image(GoProImage, 0, 0);
}
String findImageURL(String pageURL)
{
  String url = null;
  String[] lines = loadStrings(pageURL);
  for (String line : lines)
  {
    Matcher m = pat.matcher(line);
    if (m.matches())
    {
      url = m.group(1);
      break;
    }
  }
  return url;
}
PImage loadCachedImage(String fileName, String url)
{
  PImage img = loadImage(fileName);
  if (img == null) // Not downloaded yet
  {
    img = loadImage(url);
    if (img != null)
    {
      img.save(fileName); // Cache of the file
    } else
    {
      println("Unable to load the image from " + url);
      exit();
    }
  }
  return img;
}
Answers
FFS! The original code had
"Pattern pat = Pattern.compile("\\s*href=\"(GOPR.?)\".*"); //REGEX NOT CURRENTLY WORKING!"
But for some reason a "\" and a "*" got stripped out when pasting to the forum. It seems you need to type three backslashes for two to appear in the forum?!
EDIT, thanks, now sorted it
https://forum.Processing.org/two/discussion/15473/readme-how-to-format-code-and-text
@MFX -- have you tried testing your regex and your sample match data using an interactive regex testing tool such as regex101 (online) or RegexBuddy (desktop)?
Is it working, or is the problem with the regex pattern itself rather than the Processing code?
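One way to answer that question is to run the pattern outside Processing, against the sample tag from the question. The sketch below is plain Java; the class and method names are made up for illustration, and it assumes the anchor tag arrives as one line of the page, as it would from loadStrings(). It compares matches(), which must consume the whole line, with find(), which looks for the pattern anywhere in the line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexIsolationTest {
    // The pattern from the question, unchanged.
    static final Pattern PAT = Pattern.compile("\\s*href=\"(GOPR.*?)\".*");

    // Group 1 if the WHOLE line matches the pattern, else null.
    // This is what the sketch's findImageURL() does via m.matches().
    static String viaMatches(String line) {
        Matcher m = PAT.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    // Group 1 if the pattern occurs ANYWHERE in the line, else null.
    static String viaFind(String line) {
        Matcher m = PAT.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample tag copied from the question.
        String line = "<a class=\"link\" href=\"GOPR0237.JPG\">GOPR0237.JPG</a>";
        System.out.println("matches(): " + viaMatches(line));
        System.out.println("find():    " + viaFind(line));
    }
}
```

With the sample tag this reports null for matches() and GOPR0237.JPG for find(). That suggests the pattern itself is usable and the whole-line match is what fails, since the line starts with <a class=... rather than optional whitespace followed by href=.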
If this matches, it'll set the first capture group to the filename GOPRxxxx.JPG.
It assumes only digits between the GOPR and the .JPG, and all uppercase.
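For reference, a minimal sketch of a pattern with those assumptions might look like the following. This is not the regex from the post above; the class name, the LINK constant and the imageUrl() helper are hypothetical, and it additionally assumes exactly four digits, as described in the question. URL_BASE is the value from the question's sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GoProLinkExtractor {
    // Base URL taken from the question's sketch.
    static final String URL_BASE = "http://10.5.5.9:8080/DCIM/100GOPRO/";

    // href=" followed by GOPR, exactly four digits and .JPG (uppercase only).
    // The filename is captured as group 1.
    static final Pattern LINK = Pattern.compile("href=\"(GOPR\\d{4}\\.JPG)\"");

    // Full image URL if the line contains a matching link, else null.
    static String imageUrl(String line) {
        Matcher m = LINK.matcher(line);
        // find() is used so the rest of the tag around href= does not matter.
        return m.find() ? URL_BASE + m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "<a class=\"link\" href=\"GOPR0237.JPG\">GOPR0237.JPG</a>";
        System.out.println(imageUrl(line));
    }
}
```

For the sample tag this gives http://10.5.5.9:8080/DCIM/100GOPRO/GOPR0237.JPG. A lowercase filename or one with a different number of digits would not match, so the pattern would need loosening if the camera names files differently.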
Thanks all. I've decided to start from scratch and approach it differently rather than try to use someone else's code; this works as a starting point.
Ignore the "target="_blank" rel="nofollow">http://10.5.5.9:8080/DCIM/100GOPRO/");" bit, the forum inserted that.
ah, ok, that works in bash, not yet in java. give me a minute...
oh, you don't need to escape the grouping brackets in java
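To illustrate the point about grouping brackets: in java.util.regex a plain ( ... ) is a capture group, while \( and \) match literal bracket characters. The bash difference presumably comes from tools such as grep and sed, which default to POSIX basic regular expressions where groups are written \( ... \). The sketch below is only a demonstration; the class, helper names and test strings are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupingDemo {
    // Number of capture groups the regex defines.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Text captured by group 1, or null if there is no match or no group 1.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return (m.find() && m.groupCount() >= 1) ? m.group(1) : null;
    }

    // True if the regex matches somewhere in the input.
    static boolean foundIn(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String unescaped = "GOPR(\\d+)\\.JPG";     // ( ) form a capture group
        String escaped = "GOPR\\(\\d+\\)\\.JPG";   // \( \) match literal brackets

        System.out.println(groupCount(unescaped));                 // one group
        System.out.println(firstGroup(unescaped, "GOPR0237.JPG")); // the digits
        System.out.println(groupCount(escaped));                   // no groups
        System.out.println(foundIn(escaped, "GOPR0237.JPG"));      // no literal brackets in input
        System.out.println(foundIn(escaped, "GOPR(0237).JPG"));    // literal brackets present
    }
}
```

The unescaped form captures 0237 from GOPR0237.JPG, whereas the escaped form defines no group at all and only matches text that really contains brackets, such as GOPR(0237).JPG.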