
Database (MySQL) data string (via ADBC) returns in Chinese/Japanese characters, why?!

fixmedia
Registered: Nov 15 2010
Posts: 17
Answered

Dear Acrobat scripters,
 
I have an Acrobat 9 Pro JavaScript in a test form that retrieves all database tables ordered by creation date. The result of this query feeds a combo box in the form.
 
For some reason the table names come back in Acrobat as Chinese/Japanese characters, and I’m clueless as to why. Before this I had the same problem with the values of a column, which showed up as the same Chinese/Japanese characters in a text field. I solved that by changing the column collation from utf8_general_ci to latin1_swedish_ci. Don’t ask me why this solved the problem.
 
If I run the (MySQL) query in phpMyAdmin, everything’s fine.
Does anyone have a suggestion?
 
Database: MySQL 5.1.36
MySQL character set: UTF-8 Unicode (utf8)
 
PS. I know ADBC is dropped in Acrobat X.
 
*My test code*
// Combo box that will list the project tables.
var cb = this.getField("combobox_projectnr");
var adbcConnection;
var adbcConnectionStatement;
var tableArray = [];
var sortedArray = [];

// formDebug is defined elsewhere in the form; both branches currently do the same thing.
if (formDebug == true) {
    //this.getField("stickernr_input").value = "";
    //this.getField("combobox_bouwlaag").value = "Maak een selectie"; // "Make a selection"
    populateCombobox();
}
else {
    populateCombobox();
}

// Open the ADBC connection (data source name, user, password) and load the tables.
function populateCombobox() {
    try {
        adbcConnection = ADBC.newConnection("pdfTest", "root", "");
        if (adbcConnection == null) {
            console.println("No connection to the database.");
        }
        else {
            //console.println("getTables function");
            getTables();
        }
    }
    catch (e) {
        console.println("Unfortunately an error occurred: " + e);
    }
}

// Fetch the table list from the connection, then fill the combo box in creation order.
function getTables() {
    adbcConnectionStatement = adbcConnection.newStatement();
    tableArray = adbcConnection.getTableList();
    sortByCreationDate();
}

My Product Information:
Acrobat Pro 9.0, Windows
fixmedia
Registered: Nov 15 2010
Posts: 17
...
function sortByCreationDate() {
    var getDate = "SELECT table_name AS projectnummer FROM INFORMATION_SCHEMA.TABLES WHERE table_schema='vanwaning' ORDER BY create_time ASC";
    try {
        adbcConnectionStatement = adbcConnection.newStatement();
        adbcConnectionStatement.execute(getDate);
        cb.clearItems();

fixmedia
Registered: Nov 15 2010
Posts: 17
...
        // For some reason the forum will not show for loops, so the loop header is missing here.
            adbcConnectionStatement.nextRow();
            var row = adbcConnectionStatement.getRow();
            app.alert("row.projectnummer.value: " + row.projectnummer.value, 3);
            cb.insertItemAt(tableArray[i].name, tableArray[i].name, i); //row.projectnummer.value
        }
    }
    catch (e) {
        app.alert("Unfortunately something went wrong while sorting the database tables.");
    }
}
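
Since the loop itself did not survive posting, here is a minimal sketch of how a row loop over an ADBC statement could look. It is not the original code: the helper name fillComboFromStatement and its parameters are hypothetical, and it assumes that nextRow() throws an exception once the result set is exhausted.

```javascript
// Hypothetical helper: copies one column of every result row into a combo box.
// Assumes statement.execute(...) has already been called.
function fillComboFromStatement(statement, field, columnName) {
    field.clearItems();
    var count = 0;
    while (true) {
        try {
            statement.nextRow();
        } catch (e) {
            break; // assumption: nextRow() throws when no rows are left
        }
        var row = statement.getRow();
        var name = String(row[columnName].value);
        // Use the same text as display name and export value.
        field.insertItemAt(name, name, count);
        count++;
    }
    return count; // number of items inserted
}
```

In the form it could be called right after execute(), for example fillComboFromStatement(adbcConnectionStatement, cb, "projectnummer").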
thomp
Expert
Registered: Feb 15 2006
Posts: 4411
You have a fairly complex problem. It's hard to tell without examining your setup in detail, but it seems that Acrobat is receiving data in mixed character encodings. Or it could be a byte-ordering issue. The solution, as you've already seen, is to force a consistent Unicode encoding across the entire DB.
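
To illustrate the byte-ordering idea: if single-byte text is read as if it were UTF-16, every two bytes collapse into one code unit, and ordinary Latin letters mostly land in the CJK range. The sketch below only simulates that in plain JavaScript. Whether the ODBC driver or ADBC actually does this in a given setup is an assumption, and misreadAsUtf16 is a made-up helper name.

```javascript
// Simulates reading a byte sequence (given as a string of single-byte
// characters) as UTF-16 code units. A trailing odd byte is dropped.
function misreadAsUtf16(text, littleEndian) {
    var out = "";
    for (var i = 0; i + 1 < text.length; i += 2) {
        var b0 = text.charCodeAt(i) & 0xFF;
        var b1 = text.charCodeAt(i + 1) & 0xFF;
        // Big-endian puts the first byte in the high half, little-endian the second.
        var unit = littleEndian ? ((b1 << 8) | b0) : ((b0 << 8) | b1);
        out += String.fromCharCode(unit);
    }
    return out;
}
```

For example, the two bytes of "ab" (0x61, 0x62) read big-endian give the single code unit U+6162, which is a CJK ideograph.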

Also keep in mind that ADBC was never really well supported, and it's been slated for deprecation for several years.

Thom Parker
The source for PDF Scripting Info
www.pdfscripting.com
Very Important - How to Debug Your Script

fixmedia
Registered: Nov 15 2010
Posts: 17
@thomp: Is there any way from Acrobat (with ADBC in mind) to force a consistent Unicode encoding, or does this have to be done on the MySQL side? I'm using Wamp Server 2 locally with the standard install settings and have never had this kind of strange Unicode problem with other, non-Acrobat solutions.

Off-topic: as a beginning all-round programmer I find ADBC very useful for (small) projects like this one. The newer, better alternatives to ADBC that are recommended (by UVSAR) are XFA (LiveCycle, if I understand correctly) or SOAP, but both are in my perception a bigger bone to deal with. ADBC is pretty easy to use and does what I want it to do.
thomp
Expert
Registered: Feb 15 2006
Posts: 4411
Have you looked at the information in the ColumnInfo? It's entirely possible that the data is not being delivered as a string: it may be in a format that could be reinterpreted, or it may be hex-encoded data in a string. When hex-encoded data is displayed as a string it can often be interpreted as Chinese or other odd characters.
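
One way to check what actually arrives is to print the code units of the returned value instead of the rendered text. A minimal sketch in plain JavaScript follows. describeCodeUnits is a hypothetical helper, not part of the ADBC API.

```javascript
// Returns the UTF-16 code units of a value as "U+XXXX" tokens,
// so the raw contents can be inspected independently of any font.
function describeCodeUnits(value) {
    var s = String(value);
    var parts = [];
    for (var i = 0; i < s.length; i++) {
        var hex = s.charCodeAt(i).toString(16).toUpperCase();
        while (hex.length < 4) {
            hex = "0" + hex;
        }
        parts.push("U+" + hex);
    }
    return parts.join(" ");
}
```

In Acrobat this could be used as console.println(describeCodeUnits(row.projectnummer.value)) after getRow(). Values between U+0020 and U+007E mean plain ASCII arrived, while values between U+4E00 and U+9FFF mean the characters were already CJK ideographs by the time the script saw them.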

Otherwise you're getting into some iffy territory here. It may be possible to reinterpret the data returned from ADBC by converting the strings into a stream object and then examining the characters, but I don't know. If the data is delivered as a string, there is probably some translation being done on the data before it arrives.
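
If the garbling really is two bytes fused into each character (an assumption, not something confirmed in this thread), the string can be split back into bytes in script for examination, without a stream object. splitCodeUnitsToBytes below is a hypothetical helper for experimenting with that idea.

```javascript
// Splits every UTF-16 code unit of a string into its two bytes and returns
// them as a string of single-byte characters. littleEndian selects which
// half of the code unit is treated as the first byte.
function splitCodeUnitsToBytes(text, littleEndian) {
    var out = "";
    for (var i = 0; i < text.length; i++) {
        var unit = text.charCodeAt(i);
        var hi = (unit >> 8) & 0xFF;
        var lo = unit & 0xFF;
        out += littleEndian
            ? String.fromCharCode(lo, hi)
            : String.fromCharCode(hi, lo);
    }
    return out;
}
```

If the result of either byte order reads as the expected table name, that would point to a byte-pairing problem somewhere between the database and the script. If neither does, some other translation is happening.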


fixmedia
Registered: Nov 15 2010
Posts: 17
Accepted Answer
@thomp: I checked the typeName, which returned 'varchar', so I thought it had to be a strange character set problem, and indeed it was. I still haven't a clue why it only works with the latin1_swedish_ci collation, but for now it works.

The query which saved me for the moment:
SELECT (SELECT CONVERT(table_name USING latin1)) AS aanmaak_volgorde FROM INFORMATION_SCHEMA.TABLES WHERE table_schema='target-database' ORDER BY create_time DESC
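
For reuse, the same query can be assembled by a small helper so the schema name is not hard-coded. This is only a sketch: buildTableListQuery is a made-up name, the extra (SELECT ...) wrapper is left out on the assumption that it is not needed, and the quote doubling is basic escaping for a SQL string literal, not full protection against injection.

```javascript
// Hypothetical helper: builds the table-list query with the table names
// converted to a given character set (latin1 by default, as in this thread).
function buildTableListQuery(schemaName, charset) {
    charset = charset || "latin1";
    // The character set name cannot be quoted, so allow simple identifiers only.
    if (!/^[A-Za-z0-9_]+$/.test(charset)) {
        throw new Error("Unexpected character set name: " + charset);
    }
    // Double any single quotes so the schema name stays inside the literal.
    var escapedSchema = String(schemaName).replace(/'/g, "''");
    return "SELECT CONVERT(table_name USING " + charset + ") AS aanmaak_volgorde " +
           "FROM INFORMATION_SCHEMA.TABLES " +
           "WHERE table_schema='" + escapedSchema + "' " +
           "ORDER BY create_time DESC";
}
```

The returned string would then be passed to the statement's execute() method, and the value read from the aanmaak_volgorde column of each row.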

This returns a list that is displayed properly in my Acrobat 9 Pro form. As I said, I'm still clueless why it has to be that particular character set. I tested in an environment configured completely for UTF-8 Unicode, but it still didn't work unless the data was specifically in latin1(_swedish_ci).

Hope to find the real reason why this occurs. Thanks for now.