Attribute split option in scripting language

falqn Global Mapper User Posts: 123 Trusted User
edited January 2015 in Suggestion Box
Hi,
Something like an "attribute split" would be very useful when working with attributes. For example, if ATTR_0 has the value 'A,C', a "split" command would let me easily create ATTR_1 with value 'A' and ATTR_2 with value 'B', using ',' as the split separator.
This option could be included in the CALC_ATTR command: a new TYPE=SPLIT operation could be added, and maybe SEP_STR could be used for the split separator.

Regards
Wojtek

Comments

  • bmg_bob Global Mapper Programmer Posts: 2,135
    edited December 2014
    Hello Wojtek,

    I have added feature request #15065 to our task list so we can evaluate inclusion of this functionality in a future version of Global Mapper. We will post back to this thread when the status of that feature request changes.

    Cheers,

    Bob
  • JeffH@BMG Global Mapper Developer Posts: 331 Trusted User
    edited December 2014
    Hello Wojtek,

    Yes, I can definitely see the usefulness of this type of operation. I'm not sure whether we will implement this type of operation in quite that way, for this reason: we already have a new feature in the works that will allow you to use a formula to create new attribute values (similar to the raster calculator formulas), and will include several string operations.

    One of the string operations is called 'clip', and you would be able to use it to clip delimited text out of a string. To use your example, to clip "A" out of the value of ATTR_0, you would use a formula like: clip( ATTR_0, "", ","). To clip "B" out of ATTR_0, you would use: clip( ATTR_0, ",", "").

    This command will be scriptable, just as the current "CALC_ATTR" command is. The new feature hasn't been finalized yet, but we have it scheduled for a release that's not too far away.

    Thanks for the suggestion. If, after the above, you still want the new option for the existing CALC_ATTR command, I'll be happy to add a ticket for that. And if you have suggestions for the new formula feature, please feel free to add them here. In either case, we will post back here with updates.

    Best regards,

    ~Jeff

    Edit: I see that my colleague Bob beat me to a reply here. Either way, we will evaluate and let you know. Thanks again.
  • falqn Global Mapper User Posts: 123 Trusted User
    edited December 2014
    Hi Jeff,
    Without a manual I can't understand exactly how this "clip" operation will work (hard day behind me ;) but I feel that this new command will give great opportunities to manipulate attribute values. So I think my idea of expanding the CALC_ATTR possibilities is no longer relevant.
    I'm looking forward to trying the new command.
    Regards
    Wojtek
  • JeffH@BMG Global Mapper Developer Posts: 331 Trusted User
    edited December 2014
    Hi Wojtek,

    Well, the feature hasn't been finalized yet, but the basic idea is that clip() is designed for clipping a string out of another string. It has three parameters. The first is the input string (usually designated by an attribute name), and the second and third strings contain string delimiters.

    The first delimiter parameter contains the set of characters that delimit the left side of the string we will return; it just means that we read the input string until we find one of the characters in the first delimiter string, and we discard all of what we have read from the returned string (including the delimiter character). As a special case, if the first delimiter string is empty, we ignore it and remove nothing.

    At that point, we consider the second delimiter string. It's the set of characters that delimit the right side of the string that we will return. We scan the remainder of the input string until we find a character that is in the second delimiter string, and we discard that character and all remaining characters in the string. As a special case, if the second delimiter string is empty, we ignore it and discard nothing.

    Whatever is left over is returned as the result of clip().

    So for example:
    * clip( "123,456,789", "", "" ) would return "123,456,789" (both delimiters empty)
    * clip( "123,456,789", "", "," ) would return "123" (first delimiter empty)
    * clip( "123,456,789", ",", "" ) would return "456,789" (second delimiter empty)
    * clip( "123,456,789", ",", "," ) would return "456" (both delimiters non-empty)
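The behaviour described above can be modelled in a few lines of Python. This is only a sketch of the description in this post, not Global Mapper's implementation; in particular, the post does not say what happens when a non-empty delimiter set is never found, so leaving the text unchanged in that case is an assumption.

```python
def clip(text: str, left_delims: str, right_delims: str) -> str:
    """Model of the proposed clip() function as described in the post."""
    # Left side: drop everything up to and including the first character
    # that appears in left_delims. An empty left_delims removes nothing.
    if left_delims:
        for i, ch in enumerate(text):
            if ch in left_delims:
                text = text[i + 1:]
                break
        # Assumption: if no left delimiter is found, text is left unchanged.

    # Right side: drop the first character that appears in right_delims
    # and everything after it. An empty right_delims removes nothing.
    if right_delims:
        for i, ch in enumerate(text):
            if ch in right_delims:
                text = text[:i]
                break
        # Assumption: if no right delimiter is found, text is left unchanged.

    return text


print(clip("123,456,789", ",", ","))
```

Each delimiter argument is treated as a set of characters, so passing ",;" as a delimiter would stop at whichever of the two characters comes first.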

    That's the idea in a nutshell. Pretty simple, really, and there is nothing stopping us from adding more powerful functions as they're identified by our users.

    ~Jeff
  • falqn Global Mapper User Posts: 123 Trusted User
    edited December 2014
    Hi Jeff,
    It's all clear now. I was confused by spaces after commas, maybe they are not necessary... ex: clip("123_456_789","","_") would return 123. Great feature, I'm waiting for it.
    Regards
    Wojtek
  • Mykle Global Mapper User Posts: 451 Trusted User
    edited December 2014
    So how would you retrieve "789", by a two-step operation?

    clip( clip( "123,456,789", ",", "" ), ",", "" )

    The first (inner) returns "456,789"
    The second (outer) returns "789" (all after first delimiter)

    The syntax is versatile for the first few arguments, then it gets complicated.
    Mykle
  • JeffH@BMG Global Mapper Developer Posts: 331 Trusted User
    edited December 2014
    Hi Mykle,

    Yes, that's how you'd do it using the proposed clip() function. Probably not going to scale all that prettily, but it will work. On the other hand, that might also lead us in the direction of using regular expressions to do matching and replacement operations, which is a distinct possibility. How do you feel about something like the following?

    match( "123,456,789", "[0-9]*$" )

    :)

    ~Jeff
  • Mykle Global Mapper User Posts: 451 Trusted User
    edited December 2014
    Hi Jeff (and Wojtek),

    I am by no means an expert, let alone a regular user, of regular expressions. A very quick Google search suggests that there are several schemes that return different styles of results. That said, methods that use syntax familiar to users would be good (like your thought to use something similar to the raster calculator formulas).

    I get the impression that your example will return the string that contains characters that match your pattern. So I don't appreciate how a set of values are "clipped" out of the input string, but that may be my lack of appreciation for the power of regular expressions, and not what you intend.

    That said, if I were interested in extracting values from a string, I might want to do ONE of the following:
    a. return a set of values
    b. iterate through a loop, pulling the next value from the string
    c. return the nth value from a string

    That assumes that the function arguments include the string, a delimiter, and an integer value for "n"
    (this works for options b. and c. to return one value at a time, assuming that a. returning a set of values on one call may be more than you want to get into).

    If you want to be able to use the function within a formula, then c. looks to be the best to me. It would be better (more easily understood) than a nested method.
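To make option c. concrete, here is a rough Python sketch of what an "nth value" function could look like. The name nth_value, the 1-based index, and the empty-string result for an out-of-range index are all hypothetical choices for illustration; nothing like this has been specified for Global Mapper.

```python
def nth_value(text: str, delimiter: str, n: int) -> str:
    """Return the n-th (1-based) delimited value, or '' if there is none."""
    if not delimiter:
        # With no delimiter the whole string is the only value.
        return text if n == 1 else ""
    parts = text.split(delimiter)
    if 1 <= n <= len(parts):
        return parts[n - 1]
    return ""


print(nth_value("123,456,789", ",", 3))
```

A single call with n = 3 replaces the two nested clip() calls shown earlier, which is why this form is easier to read inside a formula.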

    On the funny side, I don't see how you would call the function with the string 'A,C' and return 'A' and 'B', as in the first example provided by Wojtek D:) Hopefully his long day is done and he has enjoyed a favorite beverage. It will be a few hours from now before I can do the same!

    Cheers,
    Mykle
  • falqn Global Mapper User Posts: 123 Trusted User
    edited December 2014
    Hi Jeff (and Mykle),
    I will be happy to see new matching and replacement operations. The example of the "match" operation you provided is interesting.
    I was thinking about expanding the REPLACE_STR parameter to accept wildcards like '*' and '?'. For example, if I want to replace every value that begins with 'abc' with 'X', it could be:
    REPLACE_STR="ATTR_1=abc*=X"
    At the moment this operation is not possible. I hope that with the new clip/match functions I will be able to do this kind of replacement.

    Yes, I have noticed that we switch from 'A,C' to 'A' and 'B' haha ;)
  • Geo Global Mapper User Posts: 92 Trusted User
    edited December 2014
    JeffH@BMG wrote: »
    ..../...
    On the other hand, that might also lead us in the direction of using regular expressions to do matching and replacement operations, which is a distinct possibility. How do you feel about something like the following?

    match( "123,456,789", "[0-9]*$" )

    :)

    ~Jeff

    it would be wonderful!

    i've posted 5 demo samples here.
  • JeffH@BMG Global Mapper Developer Posts: 331 Trusted User
    edited December 2014
    Hi Mykle & Wojtek,

    Thanks for your thoughts. Here's what I am thinking:
    Mykle wrote: »
    I am by no means an expert, let alone a regular user, of regular expressions. A very quick Google search suggests that there are several schemes that return different styles of results. That said, methods that use syntax familiar to users would be good (like your thought to use something similar to the raster calculator formulas).
    Indeed, there are a number of regular expression dialects available, and supported by our C++ library. They're all roughly equivalent. I'll probably just pick one and work with that (I have one in mind). Regular expressions would only be used in special functions provided by the library, like 'match' above; the "[0-9]*$" above is just short for 'zero or more digits immediately preceding the end of the string'. The match() function returns the part of the string that matches that pattern; e.g., match( "abc123", "[0-9]*$" ) would return "123".

    'clip' is pretty simple-minded. Using regular expressions opens a wide door.
    I get the impression that your example will return the string that contains characters that match your pattern. So I don't appreciate how a set of values are "clipped" out of the input string, but that may be my lack of appreciation for the power of regular expressions, and not what you intend.
    The formulas return a single value only, similar to the formulas used in the raster calculator. The raster calculator deals in numbers only, and any predefined functions provided reflect that (log, log10, min, max, etc.); formulas for use with attributes and scripting variables will be able to deal with numbers and/or strings, and there will be predefined string functions available. You'll need to do nesting or composition of operations to do complicated stuff.

    As above, regular expressions only come into play in the provided functions that use them. The functions 'clip' and 'match' do indeed pull substrings out of a source string.
    That said, if I were interested in extracting values from a string, I might want to do ONE of the following:
    a. return a set of values
    b. iterate through a loop, pulling the next value from the string
    c. return the nth value from a string
    I think returning a set of values won't happen any time soon, if at all (unless the set has cardinality 1, :)). One value is returned per formula execution. Iteration is not in the cards either, at the moment. An nth-value function might be possible, but I'm not sure how to present that yet. Think of these formulas as equivalent to spreadsheet formulas (though not as powerful).
    On the funny side, I don't see how you would call the function with the string 'A,C' and return 'A' and 'B', as in the first example provided by Wojtek D:) Hopefully his long day is done and he has enjoyed a favorite beverage. It will be a few hours from now before I can do the same!
    I just let that one ride. :)
    falqn wrote:
    I will be happy to see new matching and replacement operations. The example of the "match" operation you provided is interesting.
    I was thinking about expanding the REPLACE_STR parameter to accept wildcards like '*' and '?'. For example, if I want to replace every value that begins with 'abc' with 'X', it could be:
    REPLACE_STR="ATTR_1=abc*=X"
    At the moment this operation is not possible. I hope that with the new clip/match functions I will be able to do this kind of replacement.
    Sure. I think that with formulas, a simple replace() function that uses regular expressions is feasible. You'd use something like:
    "replace( ATTR_1, 'abc', 'X' )"

    Just to be a little more specific, we would have a new command CALC_ATTR_FORMULA, similar to CALC_ATTR, that allows you to specify a formula parameter rather than explicit operand parameters. The formula is calculated in the context of the feature it is being run against, and the result is used to populate the requested attribute for that feature.
  • JeffH@BMG Global Mapper Developer Posts: 331 Trusted User
    edited December 2014
    Hi folks,

    Got some work on this done over the holidays. Based on what I have now (but still not finalized), we will have the following regular expression functions:
    • match( string, pattern ) : returns 1 (true) if the string matches the given pattern; 0 (false) otherwise.
    • search( string, pattern ) : returns the first match of the pattern in the string, if any; the empty string otherwise.
    • replace( string, pattern, string ) : replaces all occurrences of the pattern in the first string with the second string, and returns the result.

    The flavor of regular expressions we'll be using is documented here: ECMAScript syntax - C++ Reference. Careful readers will note that essentially we are mirroring the C++ <regex> functionality.
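For anyone who wants to experiment with patterns before the feature ships, the three functions can be approximated with Python's re module. This is a rough analogue only: Python's regular expression dialect is close to, but not the same as, ECMAScript, and the sketch assumes that match() tests the whole string (as C++ regex_match does) rather than a substring.

```python
import re


def match(text: str, pattern: str) -> int:
    """Return 1 if the whole string matches the pattern, else 0."""
    # Assumption: whole-string matching, mirroring C++ std::regex_match.
    return 1 if re.fullmatch(pattern, text) else 0


def search(text: str, pattern: str) -> str:
    """Return the first substring matching the pattern, or ''."""
    found = re.search(pattern, text)
    return found.group(0) if found else ""


def replace(text: str, pattern: str, replacement: str) -> str:
    """Replace every match of the pattern with the replacement text."""
    return re.sub(pattern, replacement, text)


print(search("abc123", r"[0-9]*$"))
```

With these definitions, replace("abcdef", r"^abc.*", "X") turns any value that begins with 'abc' into 'X', which is the wildcard replacement asked about earlier in the thread.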

    Y'all should be able to have some fun with that.

    Merry Christmas!

    ~Jeff
  • falqn Global Mapper User Posts: 123 Trusted User
    edited January 2015
    Hi Jeff,
    A new, scriptable (I hope) command would be great. When do you think this command will be finalized?
    Happy New Year!
    Wojtek
  • JeffH@BMG Global Mapper Developer Posts: 331 Trusted User
    edited January 2015
    Hi Wojtek,

    Happy New Year to you, too. The feature is scheduled for the 16.1 release that's upcoming pretty soon, but I've sworn a solemn oath not to give out any dates. We will have a beta before we release it, so look for that announcement.

    thanks,

    ~Jeff
  • JeffH@BMG Global Mapper Developer Posts: 331 Trusted User
    edited January 2015
    For now, I'll just post the Scripting reference as it stands today. It's similar to the older CALC_ATTR script command, but you have a formula instead of explicit operands. Hope that this helps.
    CALC_ATTR_FORMULA
    The CALC_ATTR_FORMULA command allows you to calculate a new attribute value (or update the value for an existing attribute) for features in a layer, based on a formula that may contain numbers, strings or other attributes. A number of functions may be used as well.
    The following parameters are supported by the command:
    • FILENAME - filename of the layer to update. If an empty value is passed in, all loaded vector layers will be updated. When running the script in the context of the main map view (including loading a workspace) you can also pass in the value 'USER CREATED FEATURES' to have the 'User Created Features' layer updated, or 'SELECTED LAYERS' to have any layers selected in the Control Center updated. If you don't pass anything in, all vector layers will be operated on.
    • NEW_ATTR - specifies the attribute value to create or update. See the special Attribute Name parameter details.
    • FORMULA - specifies the formula to be used to create the new attribute values. See the formula calculator reference documentation for details.
    Here is a sample of creating a new elevation attribute in feet from an elevation attribute (ELEV_M) in meters, including an appended unit string.
    GLOBAL_MAPPER_SCRIPT VERSION=1.00
    // Create new ELEV_FT attribute with elevation in feet in any loaded layers
    CALC_ATTR_FORMULA NEW_ATTR="ELEV_FT" FORMULA="ELEV_M * 3.2808"

    // Append the unit name to the new attribute
    CALC_ATTR_FORMULA NEW_ATTR="ELEV_FT" FORMULA="ELEV_FT + ' ft'"



    The formula calculator reference above leads to a different page, shown below. The Attribute Name link should be in the older script reference, and is identical to that of CALC_ATTR. I'll post Formula Calculator info into a separate post, as it's too long for this one.
  • JeffH@BMG Global Mapper Developer Posts: 331 Trusted User
    edited January 2015
    Formula Calculator

    The formula calculator is used by the CALC_ATTR_FORMULA command and the DEFINE_VAR command. It allows you to calculate a new value (or update an existing value) based on a formula. In the case of CALC_ATTR_FORMULA, the calculator allows you to calculate new attribute values based on a feature's attributes. In the case of DEFINE_VAR, the calculator allows you to calculate new script variable values. Formulas may combine numbers and strings and either feature attribute references or script variable references. A number of functions are provided to operate on these values.

    Many features of the formula calculator are identical when used in the two applications; where different, the difference will be noted below.
    Formula reference

    Formulas are expressions that can combine strings, numbers and attributes to form a new value. The formula language also includes several functions that may be called to manipulate formula values.

    Note that attributes are fundamentally strings, even though we may wish to use them as numbers, which may cause ambiguity in formulas. Rules for disambiguation are as noted below. In many cases, conversions are performed automatically; for example, when a string expression is expected, a conversion from numeric type is performed, or vice-versa. If required, there are functions to turn an expression into one or the other data type; see the NUM, STR, and BOOL functions below. Conversions are not guaranteed, particularly string-to-numeric conversions (for example, converting the string "four" to a number would give the result 0); these are generally ignored in the calculator.
    Values

    • Strings - Strings are sequences of characters, delimited by opening and closing quotes, either single (') or double ("). The opening quote must match the closing quote. To include a quote character in your formula, you may do so by preceding it with a backslash character (\). You may also use a single-quote character inside a double-quote delimited string without escaping it, and vice-versa. To include a backslash character in your string (\), precede it with another backslash character. Here are some examples of strings:

      "a string"
      'another string'
      "double-quote character (\")". The calculator would see this as: "double-quote character (")". Conversely, you could also use 'double-quote character (")'

      A note on using strings in script files: since parameters in script files are delimited by double quote characters ("), you should use single quote characters (') to delimit your strings in your formulas.
    • Numbers - Here are some examples of numbers:

      3.14159
      42
      1234e-3 (scientific notation; the result is 1.234)
      123,456 (European notation; the result is 123.456)
    • Attributes (for the Attribute Calculator and CALC_ATTR_FORMULA command only) - Attributes are named values attached to features. Attribute names may be specified in several ways. For simple attribute names, with no special characters or embedded spaces, the name itself will suffice. For attribute names that have embedded spaces or non alphanumeric characters, you can delimit the name with percent characters (%); similar to strings, to include a % character in your attribute name, precede it with a backslash character (\). And if you wish to use the result of a string expression as an attribute name, then you can use the attr() function to do so. Note that attribute names are matched in a case-insensitive manner; "ELEVATION" is the same as "elevation". Here are some examples of attribute names:

      ELEVATION
      %color name%
      %\%Count% - the calculator sees the attribute name as %Count
      attr( "ATTRIBUTE" + 5 ) - the calculator sees the name as ATTRIBUTE5
    • Special Attributes (for the Attribute Calculator and CALC_ATTR_FORMULA command only) - Special attributes are values that are derived from features, but are not proper attributes, for example, feature name, feature's layer name, etc. Special attributes are delimited by the less-than (<) and greater-than (>) characters, respectively. Note that attribute names are matched in a case-insensitive manner; <Feature Name> is the same as <feature name>. These are the special attributes:

      <Feature Name> - the display label of the feature
      <Feature Description> - the value is the description of the feature (often the same as the feature type)
      <Feature Desc> - Shortened feature description
      <Feature Layer Name> - the name of the layer that the feature is in
      <Feature Type> - the classification of the feature
      <Index in Layer> - the 0-based index of the feature in the layer that the feature is in
      <Feature Source Filename> - the filename of the layer the feature is in
      <Feature Layer Group Name> - the name of the layer group that the feature is in
    • Variables (for the DEFINE_VAR command only) - Variables are named values defined in a Global Mapper script. Variable names may be specified in two ways. For simple variable names, with no special characters or embedded spaces, the name itself will suffice. If you wish to use a variable that contains embedded spaces, you can use the ATTR function to do so. Note that variable names are matched in a case-sensitive manner; "VAR" is not the same variable as "var". Here are some examples of variable names:

      VAR
      attr( 'TIME STAMP' ) - the calculator sees the variable name as "TIME STAMP"
    • Special Variables (for the DEFINE_VAR command only) - There are a number of predefined variables that you can use in your DEFINE_VAR formulas. These are the special variables:

      SCRIPT_FILENAME - the full path and filename of the running script
      TIMESTAMP - the current date and time in system format
      DATE - the current date in system format
      TIME - current time in system format
      TIME_SINCE_START - the number of seconds since the script started running
      TIME_SINCE_LAST_LOG - the number of seconds since the last use of this variable
    Formula operators

    Formulas use various mathematical and logical operators to form a result, similar to spreadsheet formulas. They are (in order of precedence, low-to-high):
    • OR : logical OR: both operands are treated as boolean values (see the BOOL() function below), and a boolean value is returned (1 or 0).
    • AND : logical AND: both operands are treated as boolean values (see BOOL() function below), and a boolean value is returned (1 or 0).
    • =, <> : comparison operators: equals and not equals, respectively. The comparison is numeric only if both parameters are numeric; otherwise the comparison is as strings.
    • <, <=, >, >= : relational operators: less than, less than or equal to, greater than, and greater than or equal to, respectively. The comparison is numeric only if both parameters are numeric; otherwise the comparison is as strings.
    • +, - : additive operators: plus and minus, respectively. Note that for the '+' operator, if both operands are numeric, then the operation will be numeric addition; otherwise, the operation will be string concatenation. The operands for '-' are assumed to be numeric.
    • *, / : multiplicative operators: times and divide, respectively. The operands are assumed to be numeric.
    • ^ : exponentiation
    • +, -, NOT : unary operators: plus, minus, and logical NOT respectively. The operands for unary plus and minus are assumed to be numeric; the operand for unary NOT is assumed to be boolean.

    You may also use parentheses to specify order of operations. In the absence of parentheses, higher precedence operations are performed before lower precedence operations. That is, in the formula "a + b * c", the result is the value of 'a' plus the product of 'b' and 'c' (equivalently "a + (b * c)").
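The rule for the '+' operator (numeric addition only when both operands are numeric, string concatenation otherwise) can be sketched in Python as follows. The helper names are hypothetical, and two points are assumptions rather than documented behaviour: that a string which parses as a number counts as numeric, and that whole numbers are written without a decimal point when concatenated.

```python
def to_number(value):
    """Return value as a float when it is numeric, otherwise None."""
    if isinstance(value, (int, float)):
        return float(value)
    try:
        # Assumption: strings that parse as numbers count as numeric.
        return float(value)
    except ValueError:
        return None


def as_text(value):
    """Render a value as text, writing whole numbers without a decimal point."""
    if isinstance(value, str):
        return value
    number = float(value)
    return str(int(number)) if number.is_integer() else str(number)


def plus(a, b):
    """Model of '+': add if both operands are numeric, else concatenate."""
    x, y = to_number(a), to_number(b)
    if x is not None and y is not None:
        return x + y
    return as_text(a) + as_text(b)


print(plus("ATTRIBUTE", 5))
```

This is the rule that lets the same operator build the name ATTRIBUTE5 in attr( "ATTRIBUTE" + 5 ) and append a unit in ELEV_FT + ' ft'.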
    Functions

    The calculator provides a number of functions to aid in calculation of values. Note that function names are case-insensitive; e.g., abs is the same as ABS.
    • MAX( expression1, expression2 ) : Maximum value. The comparison is numeric only if both parameters are numeric; otherwise the comparison is as strings.
    • MIN( expression1, expression2 ) : Minimum value. The comparison is numeric only if both parameters are numeric; otherwise the comparison is as strings.
    • LOG( expression ) : Natural logarithm. The expression parameter is assumed to be a numeric value.
    • LOG10( expression ) : Base 10 logarithm. The expression parameter is assumed to be a numeric value.
    • ABS( expression ) : Absolute value. The expression parameter is assumed to be a numeric value.
    • LEN( expression ) : Return the length of a string.
    • LEFT( expression1, expression2 ) : Extract characters from the left side of a string. The first parameter is a string, and the second parameter is a number that specifies how many characters are to be extracted.
    • RIGHT( expression1, expression2 ) : Extract characters from the right side of a string. The first parameter is a string, and the second parameter is a number that specifies how many characters are to be extracted.
    • MID( expression1, expression2, expression3 ) : Extract characters from the middle of a string. The first parameter is a string; the second parameter is a number that specifies how many characters are to be extracted from the left side of the string; the third parameter is a number that specifies how many characters are to be extracted from the right side of the string.
    • MAKEUPPER( expression ) : Convert a string to upper case.
    • MAKELOWER( expression ) : Convert a string to lower case.
    • NUM( expression ) : Convert to numeric value
    • STR( expression ) : Convert to string value
    • BOOL( expression ) : Convert to boolean value, either 1 or 0. For numeric values, a non-zero value converts to 1, and a zero value to 0. For strings, values of "true", "t", "yes", "y", and "1" (case-insensitive) all convert to 1, while any other value converts to 0.
    • ATTR( expression ) : Treat as attribute or variable name; the result of the expression is treated as an attribute name in the CALC_ATTR_FORMULA command, or as a variable name in the DEFINE_VAR command
    • EXISTS( name ) : Returns 1 (true) if the named attribute or variable is defined, and 0 (false) otherwise. Note that the name parameter must be either an attribute or variable name, or a single string that contains the name of the attribute or variable; string expressions (e.g. "abc" + "123") are not allowed.
    • CLIP( expression, expression2, expression3 ) : Return a substring from the first expression, whose beginning is delimited by one of the characters from expression2, and whose end is delimited by one of the characters from expression3. That is, the function reads the first string until it encounters a character from the second string, and then reads on until it encounters a character from the third string, and returns the characters between (not including the delimiter characters).
    • IF( expression, expression2, expression3 ) : Evaluate expression, and if it is true, return expression2; otherwise return expression3
    • MATCH( expression, expression2 ) : Determine whether the first expression matches the regular expression designated by the second expression, and if so, return 1 (true); otherwise, return 0 (false).
    • REPLACE( expression, expression2, expression3 ) : Replace any substrings in the first expression that match the regular expression designated by the second expression with the text of the third expression. If no matches are found, then the result is the original string.
    • SEARCH( expression, expression2 ) : Return the first substring from the first expression that matches the regular expression designated by the second expression. If no match is found, then the result is an empty string.
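As an illustration of the conversion rules stated for BOOL, and of how IF builds on them, here is a small Python sketch. The names to_bool and if_ are hypothetical stand-ins, and the sketch follows only what the list above says; it does not claim to reproduce how the calculator treats values in edge cases such as numeric-looking strings.

```python
TRUE_STRINGS = {"true", "t", "yes", "y", "1"}


def to_bool(value) -> int:
    """Model of BOOL(): convert a number or string to 1 or 0."""
    if isinstance(value, (int, float)):
        # Numeric values: any non-zero value is true.
        return 1 if value != 0 else 0
    # Strings: only the listed words (case-insensitive) are true.
    return 1 if str(value).lower() in TRUE_STRINGS else 0


def if_(condition, when_true, when_false):
    """Model of IF(): pick a result based on the boolean value of condition."""
    return when_true if to_bool(condition) else when_false


print(if_("Yes", "VALID", "INVALID"))
```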
    Regular Expressions

    Regular expressions are used in the MATCH, SEARCH, and REPLACE functions. Regular expressions allow a very rich and powerful set of capabilities for performing pattern matching in strings, and there are many variations of regular expression languages. Regular expressions in Global Mapper are specified using a modified version of the ECMAScript regular expression language, documented here: Modified ECMAScript regular expression grammar.

    Assume that the attribute "DLG3CODEPAIR" is the string "120,250". Some examples:
    • The function match(DLG3CODEPAIR, "\d*,\d*") specifies a regular expression that is zero or more digits, followed by a comma, followed by zero or more digits. Consequently, the match function would return 1.
    • The function search(DLG3CODEPAIR, "\d*") specifies a regular expression that is zero or more digits. Consequently, the search function would return the string "120".
    • The function search(DLG3CODEPAIR, ",\d*") specifies a regular expression that is a comma followed by zero or more digits. Consequently, the search function would return the string ",250".
    • The function replace(DLG3CODEPAIR, "\d*,", "10" + ",") specifies a regular expression that is zero or more digits followed by a comma. Consequently, the replace function would return the string "10,250".

    Note that the SEARCH function only returns the first match in the string (if any); there is no facility for returning further matches.

    The third parameter of the REPLACE function is the replacement string. The replacement string allows you to specify literal characters as the replacement for the match pattern, but you may also use the following special sequences to specify other replacements:

    $n - The n-th backreference (i.e., a copy of the n-th matched group specified with parentheses in the regex pattern). n must be an integer value designating a valid backreference, greater than 0, and of two digits at most. E.g., '$2' specifies the second parenthesis-specified group in the match pattern.
    $& - A copy of the entire match
    $` - The prefix (i.e., the part of the target sequence that precedes the match).
    $' - The suffix (i.e., the part of the target sequence that follows the match).
    $$ - A single $ character.

  • JeffH@BMG Global Mapper Developer Posts: 331 Trusted User
    edited January 2015
    Examples for the formula calculator. Note that we're calling it the formula calculator as it's also usable for the DEFINE_VAR script command, so that you can create new script variables via formula. But the following examples use feature attributes, not script variables.

    Examples

    Note that the examples that follow are of attribute formulas; equivalent script formulas would be very similar, but would refer to script variables rather than feature attributes.

    Here is a sample of creating a new elevation attribute in feet from an elevation attribute (ELEV_M) in meters, including an appended unit string.

    GLOBAL_MAPPER_SCRIPT VERSION=1.00
    // Create new ELEV_FT attribute with elevation in feet in any loaded layers
    CALC_ATTR_FORMULA NEW_ATTR="ELEV_FT" FORMULA="ELEV_M * 3.2808"
    // Append the unit name to the new attribute
    CALC_ATTR_FORMULA NEW_ATTR="ELEV_FT" FORMULA="ELEV_FT + ' ft'"

    Here is a sample of some text manipulation using various string and regular expression functions. Assume that the feature set has an attribute named DLG3CODEPAIR that is a string formatted in the form of a comma-separated pair of numbers, e.g. "123,456" (this is the familiar USGS Digital Line Graph major/minor code pair). To implement a script that pulls these numbers out as individual attributes, you could use the clip() function:

    GLOBAL_MAPPER_SCRIPT VERSION=1.00
    // Create new separate DLG code attributes
    CALC_ATTR_FORMULA NEW_ATTR="DLGCodeMajor" FORMULA="clip( DLG3CODEPAIR, '', ',' )"
    CALC_ATTR_FORMULA NEW_ATTR="DLGCodeMinor" FORMULA="clip( DLG3CODEPAIR, ',', '' )"

    Alternately, you could use regular expressions: '\d' is any digit, '+' means match one or more of the previous pattern, and '$' means match the end of the source string:
    GLOBAL_MAPPER_SCRIPT VERSION=1.00
    // Create new separate DLG code attributes
    CALC_ATTR_FORMULA NEW_ATTR="DLGCodeMajor" FORMULA="search( DLG3CODEPAIR, '\d+' )"
    CALC_ATTR_FORMULA NEW_ATTR="DLGCodeMinor" FORMULA="search( DLG3CODEPAIR, '\d+$' )"

    To create a new attribute that indicates whether the DLG code pair attribute exists and is valid:
    GLOBAL_MAPPER_SCRIPT VERSION=1.00
    // Create new attribute indicating DLG code pair exists and is valid
    CALC_ATTR_FORMULA NEW_ATTR="DLGCodeValid" FORMULA="if( exists(DLG3CODEPAIR) and match(DLG3CODEPAIR,'\d+,\d+'), 'VALID', 'INVALID' )"
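The logic of that validity formula can be tried out in Python before running it on real data. This is a sketch only: a plain dictionary stands in for a feature's attributes, the function name is hypothetical, and match() is assumed to test the whole value.

```python
import re


def dlg_code_valid(attributes: dict) -> str:
    """Return 'VALID' if DLG3CODEPAIR exists and looks like 'digits,digits'."""
    value = attributes.get("DLG3CODEPAIR")
    # exists(...) and match(...): both conditions must hold.
    if value is not None and re.fullmatch(r"\d+,\d+", value):
        return "VALID"
    return "INVALID"


print(dlg_code_valid({"DLG3CODEPAIR": "123,456"}))
```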

    Here is a further example of text manipulation using the regular expression 'replace' function. Assume that the above feature set is known to contain DLG3CODEPAIR values that contain various problems in formatting, perhaps caused by user input errors. Although the standard format is described by the regular expression "\d+,\d+", actual values may contain extra spaces before or after the numbers, or the comma separator is not used or some other character, say ';' was entered. We'd like to tidy these values up so that they match the correct format. One way to do it might be to use a series of simple formulas to clean up the incorrect values. But the REPLACE function has some powerful features that can be used to fix these sorts of problems.

    First, let's define a regular expression that describes the above problems:

    '^ *(\d+) *(,|;)? *(\d+) *$'

    This says: first try to match any space characters, followed by one or more digits, followed by some spaces, optionally followed by either ',' or ';', followed by some spaces, followed by one or more digits, followed by some spaces.

    Note the use of parentheses to designate sub-patterns in the main pattern. This allows us to group smaller patterns so that we can express things like 'optional comma or semicolon' ('(,|;)?') easily, but more importantly, for the REPLACE function, it allows us to identify sub-patterns to the replacement parameter. In the replacement parameter, sub-patterns in the regular expression parameter are referenced using '$1', '$2', and so on. So in our example, the first digit string pattern is identified as '$1', the optional separator is identified as '$2', and the second digit string is identified by '$3'.

    Therefore, to get what we want, we would specify the replacement parameter as '$1,$3', meaning the first digit string, followed by a comma, followed by the second digit string. The resultant script command would be as follows (omitting any validation of the attribute existence):

    GLOBAL_MAPPER_SCRIPT VERSION=1.00
    // Clean up DLG code pair attribute errors
    CALC_ATTR_FORMULA NEW_ATTR="DLG3CODEPAIR" FORMULA="replace( DLG3CODEPAIR, ' *(\d+) *(,|;)? *(\d+) *', '$1,$3' )"
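The clean-up pattern can be checked against sample values with Python's re module. Treat this as an approximation: it uses the anchored pattern defined earlier in this post, and Python writes backreferences in the replacement as \1 and \3 where the ECMAScript-style replacement string uses $1 and $3. The function name is hypothetical.

```python
import re

# Anchored pattern from the text: optional spaces, digits, optional
# ',' or ';' separator, digits, optional spaces.
CODE_PAIR = re.compile(r"^ *(\d+) *(,|;)? *(\d+) *$")


def clean_code_pair(value: str) -> str:
    """Rewrite a loosely formatted code pair as 'digits,digits'."""
    # \1 and \3 are the two digit groups; group 2 (the separator) is dropped.
    return CODE_PAIR.sub(r"\1,\3", value)


print(clean_code_pair(" 123 ; 456 "))
```

Values that do not match the pattern at all are returned unchanged, which is consistent with the REPLACE description above.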

