hello!
i'm working with a dataset in the form of several directories of *.csv (semi-comma separated, actually, for some reason) files, which i would like to process and add to a mysql db. yesterday, mr. data converter came to my attention (praise twitter!) and a few tweaks to the code allowed me to use the semi-comma delimited text, via copy+paste.
this is nice, but i wondered if i could port the code (javascript) to processing so that i could recursively go through the files and add tables. while most of it is working (the syntax at least), a couple of issues are keeping me from actually testing this on the bunch of files i mentioned above.
below is the code i have thus far (apart from SQLibrary and boilerplate code for loading the files), and below that, the questions.
first up...
issue 1
the commented lines return a " the type of the expression must be an array type but it resolved to String" error. if i'm referring to an item/index of a two-dimensional String array, is that not a String itself?
does this have anything to do with using the "+=" operator, instead of "concat()" method in String, or something of the sort? or with using a single string ("outputText") for the whole mysql output, instead of a String array?
issue 2
this is just plain wrong, i know.
how would one go about writing a method in Processing for detecting if a String is a number?
thanks for reading,
bernardo
(edited, for completeness)
i'm working with a dataset in the form of several directories of *.csv (semi-comma separated, actually, for some reason) files, which i would like to process and add to a mysql db. yesterday, mr. data converter came to my attention (praise twitter!) and a few tweaks to the code allowed me to use the semi-comma delimited text, via copy+paste.
this is nice, but i wondered if i could port the code (javascript) to processing so that i could recursively go through the files and add tables. while most of it is working (the syntax at least), a couple of issues are keeping me from actually testing this on the bunch of files i mentioned above.
below is the code i have thus far (apart from SQLibrary and boilerplate code for loading the files), and below that, the questions.
- String newLine = "\n";
- String indent = " ";
- String toMySql(String[] lines, String tableName, String[] columnNames, String[] columnTypes, String indent, String newLine) {
- String outputText = "";
- int rowCount = lines.length;
- int columnCount = columnNames.length;
- //begin render loop
- outputText += "CREATE TABLE "+tableName+" (" + newLine;
- outputText += indent+"id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,"+newLine;
- for (int i=0; i < columnCount; i++) {
- String dataType = "VARCHAR(255)";
- if ((columnTypes[i].equals("int"))||(columnTypes[i].equals("float"))) {
- dataType = columnTypes[i].toUpperCase();
- };
- outputText += indent+""+columnNames[i]+" "+dataType;
- if (i < columnCount - 1) {
- outputText += ",";
- }
- outputText += newLine;
- }
- outputText += ");" + newLine;
- outputText += "INSERT INTO "+tableName+" "+newLine+indent+"(";
- for (int i=0; i < columnCount; i++) {
- outputText += columnNames[i];
- if (i < columnCount - 1) {
- outputText += ",";
- }
- }
- outputText += ") "+newLine+"VALUES "+newLine;
- for (int i=0; i < rowCount; i++) {
- outputText += indent+"(";
- for (int j=0; j < columnCount; j++) {
- if ((columnTypes[j] == "int")||(columnTypes[j] == "float")) {
- // outputText += lines[i][j] || "null";
- }
- else {
- // outputText += "'"+( lines[i][j] || "" )+"'";
- }
- if (j < columnCount - 1) outputText += ",";
- }
- outputText += ")";
- if (i < rowCount - 1) outputText += ","+newLine;
- }
- outputText += ";";
- return outputText;
- }
- String[] getColumnNames(String[] lines, String delimiter) {
- return split(lines[0], delimiter);
- }
- //test columns for number data type
- String[] getColumnTypes(String[] lines, String delimiter) {
- int columnCount = split(lines[0], delimiter).length;
- String[] columnTypes = new String[columnCount];
- int numRowsToTest = columnCount;
- float threshold = 0.5;
- for (int i=0; i < columnCount; i++) {
- int floatCount = 0;
- int intCount = 0;
- for (int j=0; j < numRowsToTest; j++) {
- /*if (CSVParser.isNumber(lines[j][i])) {
- intCount++;
- if (String(lines[j][i]).indexOf(".") > 0) {
- floatCount++;
- }
- }*/
- }
- if ((intCount / numRowsToTest) > threshold) {
- if (floatCount > 0) {
- columnTypes[i] = "float";
- }
- else {
- columnTypes[i] = "int";
- }
- }
- else {
- columnTypes[i] = "string";
- }
- }
- return columnTypes;
- }
- boolean isNumber (String token) {
- /*if( (token.equals(null)) || isNaN( new Number(string) ) ) {
- return false;
- } */
- return true;
- }
first up...
issue 1
- for (int i=0; i < rowCount; i++) {
- outputText += indent+"(";
- for (int j=0; j < columnCount; j++) {
- if ((columnTypes[j] == "int")||(columnTypes[j] == "float")) {
- // HERE!
// outputText += lines[i][j] || "null"; - }
- else {
- // AND HERE!
- // outputText += "'"+( lines[i][j] || "" )+"'";
- }
- if (j < columnCount - 1) outputText += ",";
- }
- outputText += ")";
- if (i < rowCount - 1) outputText += ","+newLine;
- }
the commented lines return a " the type of the expression must be an array type but it resolved to String" error. if i'm referring to an item/index of a two-dimensional String array, is that not a String itself?
does this have anything to do with using the "+=" operator, instead of "concat()" method in String, or something of the sort? or with using a single string ("outputText") for the whole mysql output, instead of a String array?
issue 2
- boolean isNumber (String token) {
- /*if( (token.equals(null)) || isNaN( new Number(string) ) ) {
- return false;
- } */
- return true;
- }
this is just plain wrong, i know.

thanks for reading,
bernardo
(edited, for completeness)
1