{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0cf1685b",
   "metadata": {},
   "source": [
    "# Using DEXiPy Data in Pandas and NumPy\n",
    "\n",
    "This notebook demonstrates how to convert DEXiPy decision alternatives to Pandas `DataFrame` and NumPy `ndarray` objects."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "643eec6e-e689-414a-95a8-44c987cb2f2c",
   "metadata": {},
   "source": [
    "## Setup"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11b9b7a9-5677-4076-9bb6-02ed7aa77bdc",
   "metadata": {},
   "source": [
    "This notebook assumes that DEXiPy is already installed in the active Python environment."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4498ffa1",
   "metadata": {},
   "source": [
    "First, import the required modules."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "f1c5426b",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "from dexipy.dexi import read_dexi_from_string, evaluate\n",
    "from dexipy.tests.testdata import CAR_XML"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8e58698c",
   "metadata": {},
   "source": [
    "Read and print the Car model.\n",
    "In this case, read it from an XML string, imported from `dexipy.tests.testdata`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "367ef3ee",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DEXi Model: CAR_MODEL\n",
      "Description: Car demo\n",
      "index id          structure        scale                     funct   \n",
      "    0 CAR_MODEL   CAR_MODEL                                          \n",
      "    1 CAR         +- CAR           unacc; acc; good; exc (+) 12 3x4  \n",
      "    2 PRICE       |- PRICE         high; medium; low (+)     9 3x3   \n",
      "    3 BUY.PRICE   | |- BUY.PRICE   high; medium; low (+)             \n",
      "    4 MAINT.PRICE | +- MAINT.PRICE high; medium; low (+)             \n",
      "    5 TECH.CHAR.  +- TECH.CHAR.    bad; acc; good; exc (+)   9 3x3   \n",
      "    6 COMFORT     |- COMFORT       small; medium; high (+)   36 3x4x3\n",
      "    7 #PERS       | |- #PERS       to_2; 3-4; more (+)               \n",
      "    8 #DOORS      | |- #DOORS      2; 3; 4; more (+)                 \n",
      "    9 LUGGAGE     | +- LUGGAGE     small; medium; big (+)            \n",
      "   10 SAFETY      +- SAFETY        small; medium; high (+)           \n"
     ]
    }
   ],
   "source": [
    "dxi = read_dexi_from_string(CAR_XML)\n",
    "print(dxi)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f20bff4d",
   "metadata": {},
   "source": [
    "In addition to the model itself, the `dxi` object contains alternatives: in this case, data for two cars.\n",
    "This data was not printed above, but it can be accessed as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "af003c2a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'name': 'Car1',\n",
       "  'CAR': 3,\n",
       "  'PRICE': 2,\n",
       "  'BUY.PRICE': 1,\n",
       "  'MAINT.PRICE': 2,\n",
       "  'TECH.CHAR.': 3,\n",
       "  'COMFORT': 2,\n",
       "  '#PERS': 2,\n",
       "  '#DOORS': 2,\n",
       "  'LUGGAGE': 2,\n",
       "  'SAFETY': 2},\n",
       " {'name': 'Car2',\n",
       "  'CAR': 2,\n",
       "  'PRICE': 1,\n",
       "  'BUY.PRICE': 1,\n",
       "  'MAINT.PRICE': 1,\n",
       "  'TECH.CHAR.': 2,\n",
       "  'COMFORT': 2,\n",
       "  '#PERS': 2,\n",
       "  '#DOORS': 2,\n",
       "  'LUGGAGE': 2,\n",
       "  'SAFETY': 1}]"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dxi.alternatives"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5825aeb0",
   "metadata": {},
   "source": [
    "Notice that these alternatives are already fully evaluated, so there is no need to call `evaluate()` on them.\n",
    "\n",
    "Store them in two separate variables:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "id": "a02cfe80",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'name': 'Car1', 'CAR': 3, 'PRICE': 2, 'BUY.PRICE': 1, 'MAINT.PRICE': 2, 'TECH.CHAR.': 3, 'COMFORT': 2, '#PERS': 2, '#DOORS': 2, 'LUGGAGE': 2, 'SAFETY': 2}\n",
      "{'name': 'Car2', 'CAR': 2, 'PRICE': 1, 'BUY.PRICE': 1, 'MAINT.PRICE': 1, 'TECH.CHAR.': 2, 'COMFORT': 2, '#PERS': 2, '#DOORS': 2, 'LUGGAGE': 2, 'SAFETY': 1}\n"
     ]
    }
   ],
   "source": [
    "alt1 = dxi.alternatives[0]\n",
    "alt2 = dxi.alternatives[1]\n",
    "print(alt1)\n",
    "print(alt2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b25dd2ea",
   "metadata": {},
   "source": [
    "Now make alternatives that will contain more varied data (`None` values, value sets and distributions).\n",
    "First, make a base alternative for evaluation, `alt0`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "id": "c8deaaeb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'CAR': None, 'PRICE': None, 'BUY.PRICE': 'low', 'MAINT.PRICE': 1, 'TECH.CHAR.': None, 'COMFORT': None, '#PERS': 2, '#DOORS': 2, 'LUGGAGE': 'medium', 'SAFETY': '*', 'name': 'Base'}\n"
     ]
    }
   ],
   "source": [
    "alt0 = dxi.alternative(\"Base\", \n",
    "        values = {\"BUY.PRICE\": \"low\", \"MAINT.PRICE\": 1, \"#PERS\": 2, \"#DOORS\": 2},\n",
    "        LUGGAGE = \"medium\", SAFETY = \"*\")\n",
    "print(alt0)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5116d99d-9b43-41a0-b125-bfca2c54435b",
   "metadata": {},
   "source": [
    "Notice that `alt0` has not been evaluated and contains `None` values. We shall leave it as it is."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c11f6257",
   "metadata": {},
   "source": [
    "Now make alternatives `alts` and `altd`. They are evaluations of `alt0` using the \"set\" and \"prob\" methods, respectively."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "id": "484efda5-89a9-4e75-87f7-799655546959",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'CAR': {0, 3}, 'PRICE': 2, 'BUY.PRICE': 2, 'MAINT.PRICE': 1, 'TECH.CHAR.': {0, 2, 3}, 'COMFORT': 2, '#PERS': 2, '#DOORS': 2, 'LUGGAGE': 1, 'SAFETY': {0, 1, 2}, 'name': 'Set'}\n",
      "{'CAR': [0.3333333333333333, 0.0, 0.0, 0.6666666666666666], 'PRICE': 2, 'BUY.PRICE': 2, 'MAINT.PRICE': 1, 'TECH.CHAR.': [0.3333333333333333, 0.0, 0.3333333333333333, 0.3333333333333333], 'COMFORT': 2, '#PERS': 2, '#DOORS': 2, 'LUGGAGE': 1, 'SAFETY': {0, 1, 2}, 'name': 'Prob'}\n"
     ]
    }
   ],
   "source": [
    "alts = dxi.evaluate(alt0, method = \"set\")\n",
    "alts[\"name\"] = \"Set\"\n",
    "print(alts)\n",
    "altd = dxi.evaluate(alt0, method = \"prob\")\n",
    "altd[\"name\"] = \"Prob\"\n",
    "print(altd)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "81079dc3-7632-4319-bb6b-13e2e584992c",
   "metadata": {},
   "source": [
    "## Converting alternatives to Pandas' `DataFrame`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2d8fd755-dc9e-4df7-8dcc-738313380bb3",
   "metadata": {},
   "source": [
    "The main DEXiPy function for converting DEXiPy alternatives to data frames is `dexipy.eval.columnize_alternatives`.\n",
    "\n",
    "Let us columnize the five alternatives:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "id": "dea54c1b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'name': ['Car1', 'Car2', 'Base', 'Prob', 'Base'], 'CAR': [3, 2, None, {0, 3}, [0.3333333333333333, 0.0, 0.0, 0.6666666666666666]], 'PRICE': [2, 1, None, 2, 2], 'BUY.PRICE': [1, 1, 'low', 2, 2], 'MAINT.PRICE': [2, 1, 1, 1, 1], 'TECH.CHAR.': [3, 2, None, {0, 2, 3}, [0.3333333333333333, 0.0, 0.3333333333333333, 0.3333333333333333]], 'COMFORT': [2, 2, None, 2, 2], '#PERS': [2, 2, 2, 2, 2], '#DOORS': [2, 2, 2, 2, 2], 'LUGGAGE': [2, 2, 'medium', 1, 1], 'SAFETY': [2, 1, '*', {0, 1, 2}, {0, 1, 2}]}\n"
     ]
    }
   ],
   "source": [
    "from dexipy.eval import columnize_alternatives\n",
    "colalt = columnize_alternatives([alt1, alt2, alt0, alts, altd])\n",
    "print(colalt)"
   ]
  },
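  {
   "cell_type": "markdown",
   "id": "sketch-columnize-text",
   "metadata": {},
   "source": [
    "Conceptually, columnizing turns a list of per-alternative dictionaries into a single dictionary of per-attribute lists. The cell below is a minimal pure-Python sketch of that idea, not DEXiPy's implementation: the helper `columnize_sketch` and the toy data are hypothetical, and filling missing keys with `None` is an assumption made only for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "sketch-columnize-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical helper: illustrates the idea only, not DEXiPy's implementation.\n",
    "def columnize_sketch(alternatives, attributes):\n",
    "    # Build one list per attribute; a missing key becomes None (an assumption).\n",
    "    return {att: [alt.get(att) for alt in alternatives] for att in attributes}\n",
    "\n",
    "# Toy alternatives written by hand for this sketch.\n",
    "toy_alts = [\n",
    "    {\"name\": \"A\", \"PRICE\": 2, \"SAFETY\": 2},\n",
    "    {\"name\": \"B\", \"PRICE\": None, \"SAFETY\": \"*\"},\n",
    "]\n",
    "toy_col = columnize_sketch(toy_alts, [\"name\", \"PRICE\", \"SAFETY\"])\n",
    "print(toy_col)"
   ]
  },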
  {
   "cell_type": "markdown",
   "id": "7b6e4bcc-0e74-4072-9dd7-b3ccbe7d895f",
   "metadata": {},
   "source": [
    "Columnized alternatives make a suitable argument to create a Panda's `DataFrame`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "id": "cd69a2a0-af6d-4deb-a461-aaf788aaf311",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "   name                                                CAR  PRICE BUY.PRICE  \\\n",
      "0  Car1                                                  3    2.0         1   \n",
      "1  Car2                                                  2    1.0         1   \n",
      "2  Base                                               None    NaN       low   \n",
      "3  Prob                                             {0, 3}    2.0         2   \n",
      "4  Base  [0.3333333333333333, 0.0, 0.0, 0.6666666666666...    2.0         2   \n",
      "\n",
      "   MAINT.PRICE                                         TECH.CHAR.  COMFORT  \\\n",
      "0            2                                                  3      2.0   \n",
      "1            1                                                  2      2.0   \n",
      "2            1                                               None      NaN   \n",
      "3            1                                          {0, 2, 3}      2.0   \n",
      "4            1  [0.3333333333333333, 0.0, 0.3333333333333333, ...      2.0   \n",
      "\n",
      "   #PERS  #DOORS LUGGAGE     SAFETY  \n",
      "0      2       2       2          2  \n",
      "1      2       2       2          1  \n",
      "2      2       2  medium          *  \n",
      "3      2       2       1  {0, 1, 2}  \n",
      "4      2       2       1  {0, 1, 2}  \n"
     ]
    }
   ],
   "source": [
    "dfalt = pd.DataFrame(colalt)\n",
    "print(dfalt)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a6a997e7-c8e6-496d-a9f3-5314724e2fb0",
   "metadata": {},
   "source": [
    "Panda detects data types itself, notice that `dfalt` types are somewhat varied:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "id": "5148e253-9bcf-4fc7-8929-f5e5c8a11a3a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "name            object\n",
      "CAR             object\n",
      "PRICE          float64\n",
      "BUY.PRICE       object\n",
      "MAINT.PRICE      int64\n",
      "TECH.CHAR.      object\n",
      "COMFORT        float64\n",
      "#PERS            int64\n",
      "#DOORS           int64\n",
      "LUGGAGE         object\n",
      "SAFETY          object\n",
      "dtype: object\n"
     ]
    }
   ],
   "source": [
    "print(dfalt.dtypes)"
   ]
  },
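  {
   "cell_type": "markdown",
   "id": "sketch-dtypes-text",
   "metadata": {},
   "source": [
    "The mix of types follows from ordinary Pandas type inference: an integer column that contains `None` is upcast to `float64` (with `None` shown as `NaN`), while a column that mixes numbers with strings, sets or lists falls back to `object`. The following sketch reproduces this on a small hand-written dictionary (the toy data is hypothetical). Converting to the nullable `Int64` dtype is shown as one possible way to keep integers alongside missing values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "sketch-dtypes-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Hand-written columnized data for illustration (hypothetical values).\n",
    "toy = {\"PRICE\": [2, 1, None], \"MAINT.PRICE\": [2, 1, 1], \"SAFETY\": [2, 1, \"*\"]}\n",
    "dftoy = pd.DataFrame(toy)\n",
    "# Column types as inferred by Pandas for the toy data.\n",
    "print(dftoy.dtypes)\n",
    "\n",
    "# Optional: the nullable integer dtype keeps integers and marks missing values as <NA>.\n",
    "print(dftoy[\"PRICE\"].astype(\"Int64\"))"
   ]
  },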
  {
   "cell_type": "markdown",
   "id": "f6ecb328-38f7-44dd-a18a-22abd701e525",
   "metadata": {},
   "source": [
    "Converting more \"benign\" DEXiPy alternatives results in simpler data frames:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "id": "ef90be27-509d-4622-850c-ffe608415ec5",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "   name  CAR  PRICE  BUY.PRICE  MAINT.PRICE  TECH.CHAR.  COMFORT  #PERS  \\\n",
      "0  Car1    3      2          1            2           3        2      2   \n",
      "1  Car2    2      1          1            1           2        2      2   \n",
      "\n",
      "   #DOORS  LUGGAGE  SAFETY  \n",
      "0       2        2       2  \n",
      "1       2        2       1  \n",
      "\n",
      " name           object\n",
      "CAR             int64\n",
      "PRICE           int64\n",
      "BUY.PRICE       int64\n",
      "MAINT.PRICE     int64\n",
      "TECH.CHAR.      int64\n",
      "COMFORT         int64\n",
      "#PERS           int64\n",
      "#DOORS          int64\n",
      "LUGGAGE         int64\n",
      "SAFETY          int64\n",
      "dtype: object\n"
     ]
    }
   ],
   "source": [
    "colalt2 = columnize_alternatives([alt1, alt2])\n",
    "dfalt2 = pd.DataFrame(colalt2)\n",
    "print(dfalt2)\n",
    "print(\"\\n\", dfalt2.dtypes)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "141c2ca2-c0a4-4344-bc14-fdfab949dcf1",
   "metadata": {},
   "source": [
    "DEXiPy alternatives that contain value sets and distributions can be converted to numeric values using `dexipy.eval.convert_alternatives()`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "e31f0db3-26f5-444b-b79e-e187bae540b6",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "   name       CAR  PRICE  BUY.PRICE  MAINT.PRICE  TECH.CHAR.  COMFORT  #PERS  \\\n",
      "0  Car1  1.000000    1.0        0.5          1.0    1.000000      1.0    1.0   \n",
      "1  Car2  0.666667    0.5        0.5          0.5    0.666667      1.0    1.0   \n",
      "2  Base       NaN    NaN        NaN          0.5         NaN      NaN    1.0   \n",
      "3  Prob  0.500000    1.0        1.0          0.5    0.555556      1.0    1.0   \n",
      "4  Base  0.666667    1.0        1.0          0.5    0.555556      1.0    1.0   \n",
      "\n",
      "     #DOORS  LUGGAGE  SAFETY  \n",
      "0  0.666667      1.0     1.0  \n",
      "1  0.666667      1.0     0.5  \n",
      "2  0.666667      NaN     NaN  \n",
      "3  0.666667      0.5     0.5  \n",
      "4  0.666667      0.5     0.5  \n",
      "\n",
      " name            object\n",
      "CAR            float64\n",
      "PRICE          float64\n",
      "BUY.PRICE      float64\n",
      "MAINT.PRICE    float64\n",
      "TECH.CHAR.     float64\n",
      "COMFORT        float64\n",
      "#PERS          float64\n",
      "#DOORS         float64\n",
      "LUGGAGE        float64\n",
      "SAFETY         float64\n",
      "dtype: object\n"
     ]
    }
   ],
   "source": [
    "from dexipy.eval import convert_alternatives\n",
    "cvagr = convert_alternatives(dxi, [alt1, alt2, alt0, alts, altd], aggregate = \"mean\")\n",
    "# with some tweaking to exclude model root attribute and include \"name\":\n",
    "colagr = columnize_alternatives(cvagr, attributes = [\"name\"] + dxi.non_root_ids)\n",
    "dfagr = pd.DataFrame(colagr)\n",
    "print(dfagr)\n",
    "print(\"\\n\", dfagr.dtypes)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "357ad06e-7b77-4ef0-bbc2-7817e8ef298c",
   "metadata": {},
   "source": [
    "It is also possible to make a `DataFrame` consisting of a textual interpretation of DEXi values using `DexiModel.textualize_alternatives()`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "id": "2f3186f9-2efc-4831-85b7-5808654e943a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "   name                           CAR   PRICE BUY.PRICE MAINT.PRICE  \\\n",
      "0  Car1                           exc     low    medium         low   \n",
      "1  Car2                          good  medium    medium      medium   \n",
      "2  Base                          None    None       low      medium   \n",
      "3  Prob              ('unacc', 'exc')     low       low      medium   \n",
      "4  Base  {'unacc': 0.33, 'exc': 0.67}     low       low      medium   \n",
      "\n",
      "                                 TECH.CHAR. COMFORT #PERS #DOORS LUGGAGE  \\\n",
      "0                                       exc    high  more      4     big   \n",
      "1                                      good    high  more      4     big   \n",
      "2                                      None    None  more      4  medium   \n",
      "3                    ('bad', 'good', 'exc')    high  more      4  medium   \n",
      "4  {'bad': 0.33, 'good': 0.33, 'exc': 0.33}    high  more      4  medium   \n",
      "\n",
      "                        SAFETY  \n",
      "0                         high  \n",
      "1                       medium  \n",
      "2                            *  \n",
      "3  ('small', 'medium', 'high')  \n",
      "4  ('small', 'medium', 'high')  \n"
     ]
    }
   ],
   "source": [
    "txtalt = dxi.textualize_alternatives([alt1, alt2, alt0, alts, altd], decimals = 2)\n",
    "coltxt = columnize_alternatives(txtalt, attributes = [\"name\"] + dxi.non_root_ids)\n",
    "dftxt = pd.DataFrame(coltxt)\n",
    "print(dftxt)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d65a1769-b4ef-4213-aedc-45ff4f323a24",
   "metadata": {},
   "source": [
    "This also works with text data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "id": "ed276071-c2fd-49aa-ace9-3268e5f2b0ec",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "   name                           CAR   PRICE BUY.PRICE MAINT.PRICE  \\\n",
      "0  Car1                           exc     low    medium         low   \n",
      "1  Car2                          good  medium    medium      medium   \n",
      "2  Base                          None    None       low      medium   \n",
      "3  Prob              ('unacc', 'exc')     low       low      medium   \n",
      "4  Base  {'unacc': 0.33, 'exc': 0.67}     low       low      medium   \n",
      "\n",
      "                                 TECH.CHAR. COMFORT #PERS #DOORS LUGGAGE  \\\n",
      "0                                       exc    high  more      4     big   \n",
      "1                                      good    high  more      4     big   \n",
      "2                                      None    None  more      4  medium   \n",
      "3                    ('bad', 'good', 'exc')    high  more      4  medium   \n",
      "4  {'bad': 0.33, 'good': 0.33, 'exc': 0.33}    high  more      4  medium   \n",
      "\n",
      "                        SAFETY  \n",
      "0                         high  \n",
      "1                       medium  \n",
      "2                            *  \n",
      "3  ('small', 'medium', 'high')  \n",
      "4  ('small', 'medium', 'high')  \n"
     ]
    }
   ],
   "source": [
    "dftxt2 = pd.DataFrame(txtalt)\n",
    "print(dftxt2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "43ecd4fa-5a54-47b1-91fd-b7055aae031b",
   "metadata": {},
   "source": [
    "## Converting alternatives to NumPy's `ndarray`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1d403032-9e08-47a7-a9dc-6b83ba7ccd73",
   "metadata": {},
   "source": [
    "A direct conversion of some columnized `alt` data is achieved by ``list(alt.values())``. This list can be used as an argument to create a `numpy.array`.\n",
    "The created array usually needs to be transposed using `array.T`. Depending on `alt` data items, specifying the array data type `dtype` might be necessary."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4c149cc3-4314-4eb1-b095-d900bd0db0b3",
   "metadata": {},
   "source": [
    "To start with simple columnized data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "id": "3cf09129-cb63-4231-8a56-68a07fd143b1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[['Car1' '3' '2' '1' '2' '3' '2' '2' '2' '2' '2']\n",
      " ['Car2' '2' '1' '1' '1' '2' '2' '2' '2' '2' '1']]\n"
     ]
    }
   ],
   "source": [
    "npalt2 = np.array(list(colalt2.values())).T\n",
    "print(npalt2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "id": "95ca2c67-5fae-4c6b-b809-0b7cdc0e409c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[['Car1' 3 2 1 2 3 2 2 2 2 2]\n",
      " ['Car2' 2 1 1 1 2 2 2 2 2 1]]\n"
     ]
    }
   ],
   "source": [
    "npalt2 = np.array(list(colalt2.values()), dtype = 'O').T\n",
    "print(npalt2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "id": "99b3581e-c60a-4983-8466-22fe8ba2cb7e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[['Car1' 1.0 1.0 0.5 1.0 1.0 1.0 1.0 0.6666666666666666 1.0 1.0]\n",
      " ['Car2' 0.6666666666666666 0.5 0.5 0.5 0.6666666666666666 1.0 1.0\n",
      "  0.6666666666666666 1.0 0.5]\n",
      " ['Base' None None None 0.5 None None 1.0 0.6666666666666666 None None]\n",
      " ['Prob' 0.5 1.0 1.0 0.5 0.5555555555555556 1.0 1.0 0.6666666666666666\n",
      "  0.5 0.5]\n",
      " ['Base' 0.6666666666666666 1.0 1.0 0.5 0.5555555555555555 1.0 1.0\n",
      "  0.6666666666666666 0.5 0.5]]\n"
     ]
    }
   ],
   "source": [
    "npagr = np.array(list(colagr.values())).T\n",
    "print(npagr)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "08612364-6f4e-49f2-ac2c-24a7b5225cf5",
   "metadata": {},
   "source": [
    "For columnized data that contains value sets and distributions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "id": "6e078502-301d-4f7f-aabf-cb2cba957212",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[['Car1' 3 2 1 2 3 2 2 2 2 2]\n",
      " ['Car2' 2 1 1 1 2 2 2 2 2 1]\n",
      " ['Base' None None 'low' 1 None None 2 2 'medium' '*']\n",
      " ['Prob' {0, 3} 2 2 1 {0, 2, 3} 2 2 2 1 {0, 1, 2}]\n",
      " ['Base' list([0.3333333333333333, 0.0, 0.0, 0.6666666666666666]) 2 2 1\n",
      "  list([0.3333333333333333, 0.0, 0.3333333333333333, 0.3333333333333333])\n",
      "  2 2 2 1 {0, 1, 2}]]\n"
     ]
    }
   ],
   "source": [
    "npalt = np.array(list(colalt.values()), dtype = 'O').T\n",
    "print(npalt)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9d46bd96-9d42-4d50-b9c4-abfc5c02a6f6",
   "metadata": {},
   "source": [
    "Conversion from textual data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "id": "b98dad02-1f0c-4ffe-8055-03354db3a46a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[['Car1' 'exc' 'low' 'medium' 'low' 'exc' 'high' 'more' '4' 'big' 'high']\n",
      " ['Car2' 'good' 'medium' 'medium' 'medium' 'good' 'high' 'more' '4' 'big'\n",
      "  'medium']\n",
      " ['Base' 'None' 'None' 'low' 'medium' 'None' 'None' 'more' '4' 'medium'\n",
      "  '*']\n",
      " ['Prob' \"('unacc', 'exc')\" 'low' 'low' 'medium' \"('bad', 'good', 'exc')\"\n",
      "  'high' 'more' '4' 'medium' \"('small', 'medium', 'high')\"]\n",
      " ['Base' \"{'unacc': 0.33, 'exc': 0.67}\" 'low' 'low' 'medium'\n",
      "  \"{'bad': 0.33, 'good': 0.33, 'exc': 0.33}\" 'high' 'more' '4' 'medium'\n",
      "  \"('small', 'medium', 'high')\"]]\n"
     ]
    }
   ],
   "source": [
    "nptxt = np.array(list(coltxt.values())).T\n",
    "print(nptxt)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cec1e63d-ba41-4be9-a3c3-3ecfa05945d7",
   "metadata": {},
   "source": [
    "Another option is to make conversions from Panda's data frames using `DataFrame.to_numpy`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "id": "7eb1b5cf-e0d2-4174-9bfa-c85a613ae3b4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[['Car1' 3 2.0 1 2 3 2.0 2 2 2 2]\n",
      " ['Car2' 2 1.0 1 1 2 2.0 2 2 2 1]\n",
      " ['Base' None nan 'low' 1 None nan 2 2 'medium' '*']\n",
      " ['Prob' {0, 3} 2.0 2 1 {0, 2, 3} 2.0 2 2 1 {0, 1, 2}]\n",
      " ['Base' list([0.3333333333333333, 0.0, 0.0, 0.6666666666666666]) 2.0 2 1\n",
      "  list([0.3333333333333333, 0.0, 0.3333333333333333, 0.3333333333333333])\n",
      "  2.0 2 2 1 {0, 1, 2}]]\n"
     ]
    }
   ],
   "source": [
    "npalt = dfalt.to_numpy()\n",
    "print(npalt)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "id": "13d34ecc-c304-4d46-9bdf-f27179822716",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[['Car1' 1.0 1.0 0.5 1.0 1.0 1.0 1.0 0.6666666666666666 1.0 1.0]\n",
      " ['Car2' 0.6666666666666666 0.5 0.5 0.5 0.6666666666666666 1.0 1.0\n",
      "  0.6666666666666666 1.0 0.5]\n",
      " ['Base' nan nan nan 0.5 nan nan 1.0 0.6666666666666666 nan nan]\n",
      " ['Prob' 0.5 1.0 1.0 0.5 0.5555555555555556 1.0 1.0 0.6666666666666666\n",
      "  0.5 0.5]\n",
      " ['Base' 0.6666666666666666 1.0 1.0 0.5 0.5555555555555555 1.0 1.0\n",
      "  0.6666666666666666 0.5 0.5]]\n"
     ]
    }
   ],
   "source": [
    "npagr = dfagr.to_numpy()\n",
    "print(npagr)"
   ]
  },
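  {
   "cell_type": "markdown",
   "id": "sketch-numeric-text",
   "metadata": {},
   "source": [
    "Because the `name` column holds strings, `to_numpy()` returns an `object` array here. If a purely numeric array is needed, one option is to move the names into the index before converting. A minimal sketch on hand-written data (the toy frame `dftoy_num` is hypothetical and only mimics the shape of `dfagr`):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "sketch-numeric-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Hand-written frame for illustration (hypothetical values).\n",
    "dftoy_num = pd.DataFrame({\"name\": [\"A\", \"B\"], \"PRICE\": [1.0, 0.5], \"SAFETY\": [1.0, np.nan]})\n",
    "\n",
    "# Move the names into the index, then convert the remaining numeric columns.\n",
    "nptoy = dftoy_num.set_index(\"name\").to_numpy(dtype=float)\n",
    "print(nptoy.dtype, nptoy.shape)\n",
    "print(nptoy)"
   ]
  },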
  {
   "cell_type": "code",
   "execution_count": 66,
   "id": "55f0fc4e-242d-4b19-9a36-99507f2c7873",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[['Car1' 'exc' 'low' 'medium' 'low' 'exc' 'high' 'more' '4' 'big' 'high']\n",
      " ['Car2' 'good' 'medium' 'medium' 'medium' 'good' 'high' 'more' '4' 'big'\n",
      "  'medium']\n",
      " ['Base' 'None' 'None' 'low' 'medium' 'None' 'None' 'more' '4' 'medium'\n",
      "  '*']\n",
      " ['Prob' \"('unacc', 'exc')\" 'low' 'low' 'medium' \"('bad', 'good', 'exc')\"\n",
      "  'high' 'more' '4' 'medium' \"('small', 'medium', 'high')\"]\n",
      " ['Base' \"{'unacc': 0.33, 'exc': 0.67}\" 'low' 'low' 'medium'\n",
      "  \"{'bad': 0.33, 'good': 0.33, 'exc': 0.33}\" 'high' 'more' '4' 'medium'\n",
      "  \"('small', 'medium', 'high')\"]]\n"
     ]
    }
   ],
   "source": [
    "nptxt = dftxt.to_numpy()\n",
    "print(nptxt)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}