machine learning - How can a PMML reader interpret the predicted value's type?


I have PMML generated for a logistic regression model using R, as follows. Only the first part of the PMML is shown here.

<?xml version="1.0"?>
<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.dmg.org/PMML-4_2 http://www.dmg.org/v4-2/pmml-4-2.xsd">
  <Header copyright="copyright (c) 2015 upeksha" description="generalized linear regression model">
    <Extension name="user" value="upeksha" extender="rattle/pmml"/>
    <Application name="rattle/pmml" version="1.4"/>
    <Timestamp>2015-12-02 08:41:27</Timestamp>
  </Header>
  <DataDictionary numberOfFields="11">
    <DataField name="responseaccountname" optype="continuous" dataType="double"/>
    <DataField name="regioncat" optype="categorical" dataType="string">
      <Value value="row"/>
      <Value value="europe"/>
      <Value value="nam"/>
    </DataField>
    <DataField name="titlecat" optype="categorical" dataType="string">
      <Value value="1"/>
      <Value value="2"/>
      <Value value="3"/>
      <Value value="4"/>
    </DataField>
    <DataField name="rlmaxtitle" optype="categorical" dataType="string">
      <Value value="1"/>
      <Value value="2"/>
      <Value value="3"/>
      <Value value="4"/>
    </DataField>
    <DataField name="act1_rate" optype="continuous" dataType="double"/>
    <DataField name="act2_rate" optype="continuous" dataType="double"/>
    <DataField name="act3_rate" optype="continuous" dataType="double"/>
    <DataField name="act4_rate" optype="continuous" dataType="double"/>
    <DataField name="act5_rate" optype="continuous" dataType="double"/>
    <DataField name="act6_rate" optype="continuous" dataType="double"/>
    <DataField name="accntact_rate" optype="continuous" dataType="double"/>
  </DataDictionary>
  <GeneralRegressionModel modelName="logistic_regression" modelType="generalizedLinear" functionName="regression" algorithmName="glm" distribution="binomial" linkFunction="logit">
    <MiningSchema>
      <MiningField name="responseaccountname" usageType="predicted"/>
      <MiningField name="regioncat" usageType="active"/>
      <MiningField name="titlecat" usageType="active"/>
      <MiningField name="rlmaxtitle" usageType="active"/>
      <MiningField name="act1_rate" usageType="active"/>
      <MiningField name="act2_rate" usageType="active"/>
      <MiningField name="act3_rate" usageType="active"/>
      <MiningField name="act4_rate" usageType="active"/>
      <MiningField name="act5_rate" usageType="active"/>
      <MiningField name="act6_rate" usageType="active"/>
      <MiningField name="accntact_rate" usageType="active"/>
    </MiningSchema>
    <Output>
      <OutputField name="predicted_responseaccountname" feature="predictedValue"/>
    </Output>

The dataType of the OutputField is not present here. How would a PMML reader interpret its type in that case?

I checked the PMML spec, and it says the dataType of an OutputField is not required. I am writing a PMML reader and need to know how the interpretation is done for PMML like this.

The dataType and optype attributes are optional for the OutputField element, but a sane PMML producer should specify them anyway, to make life easier for PMML consumers.

If the dataType attribute is missing, you can infer it from the feature attribute of the OutputField element. In the current case, the value of the feature attribute is set to predictedValue, which means that the data type and operational type are "copied" from the DataField element that represents the target field of the model. Here, the target field (aka predicted field) is called "responseaccountname", which means that the value of the OutputField element is a continuous double.
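The lookup described above can be sketched in a few lines of Python. This is only an illustration, not how any particular PMML library does it: the function name infer_output_types and the trimmed SAMPLE document are made up for the example, it only handles feature="predictedValue", and it assumes the standard PMML 4.2 namespace and a model whose MiningSchema names a single target.

```python
import xml.etree.ElementTree as ET

# Assumption: the document uses the standard PMML 4.2 namespace.
NS = {"p": "http://www.dmg.org/PMML-4_2"}


def infer_output_types(pmml_text):
    """Map each OutputField name to a (dataType, optype) pair (hypothetical helper)."""
    root = ET.fromstring(pmml_text)

    # Index the DataDictionary by field name.
    fields = {
        f.get("name"): (f.get("dataType"), f.get("optype"))
        for f in root.findall("p:DataDictionary/p:DataField", NS)
    }

    result = {}
    for model in root:
        schema = model.find("p:MiningSchema", NS)
        if schema is None:
            continue  # Header, DataDictionary, and other non-model elements

        # Target fields are marked "predicted" (PMML 4.2 also accepts "target").
        targets = [
            m.get("name")
            for m in schema.findall("p:MiningField", NS)
            if m.get("usageType") in ("predicted", "target")
        ]

        for out in model.findall("p:Output/p:OutputField", NS):
            data_type = out.get("dataType")
            optype = out.get("optype")
            if out.get("feature") == "predictedValue" and targets:
                # Explicit attributes win; otherwise copy from the target's DataField.
                target = out.get("targetField") or targets[0]
                inherited = fields.get(target, (None, None))
                data_type = data_type or inherited[0]
                optype = optype or inherited[1]
            result[out.get("name")] = (data_type, optype)
    return result


# Trimmed-down stand-in for the model in the question.
SAMPLE = """<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <DataDictionary numberOfFields="2">
    <DataField name="responseaccountname" optype="continuous" dataType="double"/>
    <DataField name="act1_rate" optype="continuous" dataType="double"/>
  </DataDictionary>
  <GeneralRegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="responseaccountname" usageType="predicted"/>
      <MiningField name="act1_rate" usageType="active"/>
    </MiningSchema>
    <Output>
      <OutputField name="predicted_responseaccountname" feature="predictedValue"/>
    </Output>
  </GeneralRegressionModel>
</PMML>"""

print(infer_output_types(SAMPLE))
```

For the sample this prints {'predicted_responseaccountname': ('double', 'continuous')}. Other feature values (for example probability) follow different rules and would need their own branches.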

