I have three columns that I wish to combine to create a column containing the coordinates of genes in the genome:
– “chrom”: a text column containing chromosome number e.g. chr1, chr2…
– “start”: a number column containing an integer, the base number of the beginning of the gene
– “end”: a number column containing an integer, the base number of the end of the gene
I need as a result “chr1: 200000-300000” for example
I combined the three using the Text Expression column which gives tokens to select the three columns. It looked OK. But if the integers are above 1×10^8 they are converted to scientific notation. This is a problem because I absolutely must keep the integer notation to use these coordinates in a genome browser.
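To illustrate the symptom outside DataGraph, here is a minimal Python sketch. It assumes the conversion comes from a general-purpose number format being applied to the value; this is a guess about the cause, not a description of DataGraph's internals. The helper name `region` is hypothetical.

```python
def region(chrom: str, start: float, end: float) -> str:
    """Build a coordinate string such as 'chr1: 200000-300000'.

    The positions are cast to int and formatted with 'd', so they are
    always written out digit by digit, never in scientific notation.
    """
    return f"{chrom}: {int(start):d}-{int(end):d}"


# A general-purpose ("g") format switches to scientific notation for
# large values, which is the kind of output a genome browser rejects.
general = format(150000000.0, "g")

# Explicit integer formatting keeps every digit.
fixed = region("chr1", 150000000.0, 150250000.0)

print(general)
print(fixed)
```

The point of the sketch is only that the notation is chosen at formatting time, not stored in the number itself.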
The menu to the right of each column token offers an “Integer” option, but selecting it doesn’t work.
I also tried converting the “start” and “end” columns to Text columns first, thinking they would no longer be treated as numbers and would keep their integer form, but that doesn’t work either.
Is there a way to globally deactivate the automatic conversion of the notation of large numbers?
dgteam (Moderator), 10 months, 2 weeks ago:
There is no preference option for deactivating the default formatting for numbers when you concatenate them with the Text expression.
What might work is to change the format to “String from Column”. Click to the right of the token to get this formatting option. Then you don’t have to change the format of your number column, but it is interpreted as text when the fields are concatenated.
Please try that and let us know if that solves the problem.
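The idea behind “String from Column”, treating the values as text so that no number formatting is applied when the fields are joined, can be sketched in Python for anyone who builds the same column outside DataGraph. This is an illustration under assumptions, not DataGraph code: the tab-separated sample table and the function name `regions_from_text` are hypothetical.

```python
import csv
import io

# Hypothetical sample data with the three columns from the question.
TABLE = """chrom\tstart\tend
chr1\t200000\t300000
chr2\t150000000\t150250000
"""


def regions_from_text(table: str) -> list[str]:
    """Join chrom, start and end without ever parsing them as numbers.

    csv.DictReader returns every field as a string, so the digits are
    copied through exactly as they appear in the input.
    """
    reader = csv.DictReader(io.StringIO(table), delimiter="\t")
    return [f"{row['chrom']}: {row['start']}-{row['end']}" for row in reader]


coords = regions_from_text(TABLE)
for line in coords:
    print(line)
```

Because the positions never become numbers, there is no formatting step that could switch them to scientific notation.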
One more thing to be aware of: if you have millions of rows, this process might be slow. The tokens are useful, but that code is not as fast as we would like and is not something we have control over (it is internal to macOS). If this combined field is something you only need to create once, you could convert the Text expression column to a Text column once it is set up properly.
If you need it to stay dynamic and notice that the file is sluggish, please let us know.
Thanks a lot. Using “String from Column” solves the problem.
But I don’t always have access to the options on the right of the token. Sometimes I have access for one token and not the other, although the two columns are of a very similar kind. I don’t understand why. At the beginning I thought it was because my coordinate columns were locked, but it’s more erratic than that. I’ll try to understand.
Indeed, I converted the combined field to a Text column immediately to optimize the processing. I didn’t mention that in the first post because it didn’t change my number notation problem. I only have 14973 rows, but it’s probably worth it.
dgteam (Moderator), 10 months, 2 weeks ago:
In case this was not clear, for the token menu you have to click directly on the “^” symbol on the right side of the token. Does that help? If not and you have a file where a token does not show the menu, it would help us to examine the file.
If possible please email the file. In DataGraph, go to the Help menu to get our email.
It was clear. I had already discovered the ^ when I tried to switch to “Integer” to prevent scientific notation. I’ll email the file if this happens again.
In some cases I can’t open the menu on the right side of the token just after entering the token: it displays “Select column in list” when I click on the ^. I have to click outside and come back before it opens. I don’t think it’s a problem with my file. It is more likely a matter of the order in which things are done, and maybe a bit of slow processing with a big file. In the end I have always managed so far. And the scientific notation conversion problem is solved. Thank you very much for your amazing responsiveness!