Hi, I have a PMML file for a logistic regression model, generated using R, as follows. Only the first part of the PMML is shown here.
<?xml version="1.0"?>
<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.dmg.org/PMML-4_2 http://www.dmg.org/v4-2/pmml-4-2.xsd">
  <Header copyright="copyright (c) 2015 upeksha" description="generalized linear regression model">
    <Extension name="user" value="upeksha" extender="rattle/pmml"/>
    <Application name="rattle/pmml" version="1.4"/>
    <Timestamp>2015-12-02 08:41:27</Timestamp>
  </Header>
  <DataDictionary numberOfFields="11">
    <DataField name="responseaccountname" optype="continuous" dataType="double"/>
    <DataField name="regioncat" optype="categorical" dataType="string">
      <Value value="row"/>
      <Value value="europe"/>
      <Value value="nam"/>
    </DataField>
    <DataField name="titlecat" optype="categorical" dataType="string">
      <Value value="1"/>
      <Value value="2"/>
      <Value value="3"/>
      <Value value="4"/>
    </DataField>
    <DataField name="rlmaxtitle" optype="categorical" dataType="string">
      <Value value="1"/>
      <Value value="2"/>
      <Value value="3"/>
      <Value value="4"/>
    </DataField>
    <DataField name="act1_rate" optype="continuous" dataType="double"/>
    <DataField name="act2_rate" optype="continuous" dataType="double"/>
    <DataField name="act3_rate" optype="continuous" dataType="double"/>
    <DataField name="act4_rate" optype="continuous" dataType="double"/>
    <DataField name="act5_rate" optype="continuous" dataType="double"/>
    <DataField name="act6_rate" optype="continuous" dataType="double"/>
    <DataField name="accntact_rate" optype="continuous" dataType="double"/>
  </DataDictionary>
  <GeneralRegressionModel modelName="logistic_regression" modelType="generalizedLinear" functionName="regression" algorithmName="glm" distribution="binomial" linkFunction="logit">
    <MiningSchema>
      <MiningField name="responseaccountname" usageType="predicted"/>
      <MiningField name="regioncat" usageType="active"/>
      <MiningField name="titlecat" usageType="active"/>
      <MiningField name="rlmaxtitle" usageType="active"/>
      <MiningField name="act1_rate" usageType="active"/>
      <MiningField name="act2_rate" usageType="active"/>
      <MiningField name="act3_rate" usageType="active"/>
      <MiningField name="act4_rate" usageType="active"/>
      <MiningField name="act5_rate" usageType="active"/>
      <MiningField name="act6_rate" usageType="active"/>
      <MiningField name="accntact_rate" usageType="active"/>
    </MiningSchema>
    <Output>
      <OutputField name="predicted_responseaccountname" feature="predictedValue"/>
    </Output>
The dataType of the OutputField is not present here. How does a PMML reader interpret its type in that case?
I checked the PMML spec, and it says that the dataType of an OutputField is not required. I am writing a PMML reader, so I need to know how the interpretation is done for a PMML file like this.
The dataType and optype attributes are optional for the OutputField element, but a sane PMML producer should specify them anyway, to make life easier for PMML consumers.
If the dataType attribute is missing, you can infer it based on the feature attribute of the OutputField element. In the current case, the value of the feature attribute is set to predictedValue, which means that the data type and operational type are "copied" from the DataField element that represents the target field of the model. Here, the target field (aka predicted field) is called "responseaccountname", which means that the value of the OutputField element is a continuous double.
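For a reader implementation, the inference rule described above could be sketched roughly as follows. This is only an illustration in Python using the standard library's xml.etree.ElementTree, not how any particular PMML consumer does it. The function name infer_output_types and the trimmed SAMPLE document are made up for the example, and it assumes the standard PMML 4.2 element and attribute casing (DataField, dataType, usageType and so on). It only handles GeneralRegressionModel and the predictedValue feature; other features are left unresolved.

```python
import xml.etree.ElementTree as ET

# PMML 4.2 namespace, bound to a prefix for ElementTree path lookups.
NS = {"p": "http://www.dmg.org/PMML-4_2"}

# Hypothetical, trimmed-down document modelled on the question's PMML.
SAMPLE = """<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <DataDictionary numberOfFields="2">
    <DataField name="responseaccountname" optype="continuous" dataType="double"/>
    <DataField name="act1_rate" optype="continuous" dataType="double"/>
  </DataDictionary>
  <GeneralRegressionModel modelName="logistic_regression" functionName="regression">
    <MiningSchema>
      <MiningField name="responseaccountname" usageType="predicted"/>
      <MiningField name="act1_rate" usageType="active"/>
    </MiningSchema>
    <Output>
      <OutputField name="predicted_responseaccountname" feature="predictedValue"/>
    </Output>
  </GeneralRegressionModel>
</PMML>"""


def infer_output_types(pmml_text):
    """Map each OutputField name to a (dataType, optype) pair.

    Explicit attributes on the OutputField win. If they are missing and the
    feature is predictedValue, the values are copied from the target DataField.
    """
    root = ET.fromstring(pmml_text)
    data_fields = {
        f.get("name"): f
        for f in root.findall("p:DataDictionary/p:DataField", NS)
    }
    result = {}
    for model in root.findall("p:GeneralRegressionModel", NS):
        # Locate the target field declared in the MiningSchema.
        target = None
        for mining_field in model.findall("p:MiningSchema/p:MiningField", NS):
            if mining_field.get("usageType") in ("predicted", "target"):
                target = data_fields.get(mining_field.get("name"))
        for out in model.findall("p:Output/p:OutputField", NS):
            data_type = out.get("dataType")
            optype = out.get("optype")
            # Assumption: a missing feature attribute means predictedValue.
            feature = out.get("feature", "predictedValue")
            if feature == "predictedValue" and target is not None:
                data_type = data_type or target.get("dataType")
                optype = optype or target.get("optype")
            result[out.get("name")] = (data_type, optype)
    return result


if __name__ == "__main__":
    print(infer_output_types(SAMPLE))
```

With the sample above, the output field predicted_responseaccountname resolves to the double data type and the continuous operational type, both taken from the responseaccountname DataField.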