Kuro's Blog: April 2015

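When implementing a command-line tool, working out how options are specified and how arguments are ordered quickly becomes tedious. Haskell provides the getArgs function in the System.Environment module, but the cmdargs package introduced in this entry makes the following easy.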

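Features of the cmdargs package:

Startup arguments and options are parsed just by defining a data structure
Parse results can be accessed as typed values
When parsing fails, an error message explaining the cause is shown
The output of the --help and --version options is generated automatically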

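According to the project homepage, cmdargs is better than the Haskell version of the GNU getopt library in the following two respects:

The command-line handling of HLint is one third the length
Multiple-mode programs such as Cabal and darcs are supported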
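For comparison, here is a rough sketch of what hand-rolled parsing on top of getArgs can look like for a single --hello option. The Options type and the parseArgs function are hypothetical names made up for this illustration; they are not part of cmdargs or of the code in this entry.

```haskell
import Data.List (stripPrefix)
import System.Environment (getArgs)

-- Hypothetical result type for this sketch (not part of cmdargs).
newtype Options = Options { hello :: String } deriving (Show, Eq)

-- Parse raw arguments by hand.
-- Only "--hello VALUE" and "--hello=VALUE" are accepted; anything else is an error.
parseArgs :: [String] -> Either String Options
parseArgs = go (Options "")
  where
    go acc [] = Right acc
    go acc ("--hello" : value : rest) = go (acc { hello = value }) rest
    go acc (arg : rest)
      | Just value <- stripPrefix "--hello=" arg = go (acc { hello = value }) rest
      | otherwise = Left ("Unknown or incomplete argument: " ++ arg)

main :: IO ()
main = do
  raw <- getArgs
  case parseArgs raw of
    Left err   -> putStrLn err
    Right opts -> print opts
```

Even this small version has to spell out every accepted form and its own error message, and it still has no --help or --version output. Those are the parts cmdargs derives from the data definition.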

A simple sample: Sample.hs

{-# LANGUAGE DeriveDataTypeable #-}
import System.Console.CmdArgs

data Sample = Sample {
      hello :: String
} deriving (Show, Data, Typeable)

sample = Sample {
           hello = def &= help "World argument" &= opt "world"}
         &= summary "This is a sample of CmdArgs."

main = print =<< cmdArgs sample

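This defines a data type (Sample) that holds the parse result. cmdArgs requires the type to be an instance of Data (and therefore Typeable); Show is derived here so that the result can be printed. def is the default value for a given type, and in this example (String) it is the same as "". Everything after &= is an annotation: help sets the help text, opt gives the value used when --hello is passed without an argument, and summary sets the first line of the --help output.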

Output: Sample.hs

% runghc Sample.hs --help
This is a sample of CmdArgs.

sample [OPTIONS]

Common flags:
  -h --hello[=ITEM]  World argument
  -? --help          Display help message
  -V --version       Print version information

% runghc Sample.hs --hello
Sample {hello = "world"}

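A more complex parse: MyArgs.hs

The following example is part of the code I used in a command I wrote myself. It takes multiple CSV files, or a directory containing CSV files, as input and stores the data from those CSV files in a DB of the type specified by dbopt (the DB file name or connection name is the mandatory first argument).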

{-# LANGUAGE DeriveDataTypeable #-}
import System.Console.CmdArgs

data DbOpt = SQLite | PostgreSQL | MySQL deriving (Data, Typeable, Eq, Show, Read)

data MyArgs = MyArgs {
      dbopt    :: DbOpt
    , schema   :: String
    , recursive :: String
    , targetdb :: String
    , csvfiles :: [String]
    , color :: Bool
    } deriving (Data,Typeable,Show)

help_dbopt = "specify DB type. default DB is 'sqlite'.\n"
             ++ "'postgresql', 'mysql' & etc. may be supported in the future."
help_schema = "specify table type.\n"
              ++ "specify predefined schema index such as 'd12', 'd13', etc.\n"
              ++ "default is 'normal' which stores all values as string.\n"
help_recursive = "specify directory to iterate all files in it recursively."
help_targetdb = "specify DB file/connection name."
help_csvfiles = "specify one or more csv files.\n"
              ++ "one file must be specified at the minimum."
help_color = "enable colored output."
help_program = "parse CSV files and store all data into DB.\n"
               ++ "TARGET_DB is the mandatory argument."

config = MyArgs {
      dbopt   = SQLite &= typ "TARGET_DB_TYPE" &= help help_dbopt
    , schema  = "normal" &= typ "SCHEMA_INDEX" &= help help_schema
    , recursive = def &= typ "RECURSIVE_DIR" &= help help_recursive
    , targetdb  = def &= typ "TARGET_DB" &= argPos 0 -- &= help help_targetdb
    , csvfiles = def &= typ "CSV_FILES" &= args -- &= help help_csvfiles
    , color = def &= name "c" &= help help_color
} &= verbosity &= program "MyArgs" &= help help_program

main = print =<< cmdArgs config

Output: MyArgs.hs

# Help
% runghc MyArgs.hs --help
The MyArgs program

MyArgs [OPTIONS] TARGET_DB [CSV_FILES]
  parse CSV files and store all data into DB. TARGET_DB is the mandatory
  argument.

Common flags:
  -d --dbopt=TARGET_DB_TYPE     specify DB type. default DB is 'sqlite'.
                                'postgresql', 'mysql' & etc. may be supported
                                in the future.
  -s --schema=SCHEMA_INDEX      specify table type. specify predefined schema
                                index such as 'd12', 'd13', etc. default is
                                'normal' which stores all values as string.
  -r --recursive=RECURSIVE_DIR  specify directory to iterate all files in it
                                recursively.
  -c --color                    enable colored output.
  -? --help                     Display help message
  -V --version                  Print version information
  -v --verbose                  Loud verbosity
  -q --quiet                    Quiet verbosity

# Error case (invalid dbopt)
% runghc MyArgs.hs -d invalid_dbopt
Could not read, expected one of: sqlite postgresql mysql
# Error case (missing argument)
% runghc MyArgs.hs 
Requires at least 1 arguments, got 0

# Normal case (multiple CSV files specified)
% runghc MyArgs.hs a.db 1.csv 2.csv
MyArgs {dbopt = SQLite, schema = "normal", recursive = "", targetdb = "a.db", csvfiles = ["1.csv","2.csv"], color = False}
# Normal case (dbopt and recursive specified via options)
% runghc MyArgs.hs a.db -r data_dir -d MySQL
MyArgs {dbopt = MySQL, schema = "normal", recursive = "data_dir", targetdb = "a.db", csvfiles = [], color = False}

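The values given as startup arguments and options can be read from the MyArgs value produced by the parse. The output also shows that an appropriate error is printed when the arguments or options are invalid, for example when the mandatory first argument is missing or when an invalid DB type is given for dbopt.
(2016/02/11: added the Bool option --color.)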
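As a small illustration of reading values out of the parse result, the sketch below works on a plain copy of the MyArgs record. The cmdargs annotations are left out so that it runs with base alone; in the real program the value would come from cmdArgs config. The helpers inputSource and describeTarget are hypothetical names for this example, and the "not supported yet" wording is an assumption based on the help text above.

```haskell
-- Plain copies of the types from MyArgs.hs, without cmdargs annotations,
-- so that this sketch needs nothing beyond base.
data DbOpt = SQLite | PostgreSQL | MySQL deriving (Eq, Show)

data MyArgs = MyArgs
  { dbopt     :: DbOpt
  , schema    :: String
  , recursive :: String
  , targetdb  :: String
  , csvfiles  :: [String]
  , color     :: Bool
  } deriving (Show)

-- Hypothetical helper: decide where the CSV input comes from.
-- Explicit files win over the recursive directory; neither is an error.
inputSource :: MyArgs -> Either String String
inputSource a
  | not (null (csvfiles a))  = Right ("files: " ++ unwords (csvfiles a))
  | not (null (recursive a)) = Right ("directory: " ++ recursive a)
  | otherwise                = Left "no CSV files or directory given"

-- Hypothetical helper: describe the target DB.
-- Assumption: only SQLite is handled, as the help text suggests.
describeTarget :: MyArgs -> String
describeTarget a = case dbopt a of
  SQLite -> "SQLite file " ++ targetdb a
  other  -> show other ++ " connection " ++ targetdb a ++ " (not supported yet)"

main :: IO ()
main = do
  -- Stand-in for the value that cmdArgs would return.
  let parsed = MyArgs SQLite "normal" "" "a.db" ["1.csv", "2.csv"] False
  putStrLn (describeTarget parsed)
  print (inputSource parsed)
```

Because dbopt is a DbOpt rather than a String, the case expression only has to deal with constructors that actually exist; an invalid value is already rejected during parsing.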


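This is how to completely remove Haskell Platform on Windows.
There is plenty of information online for Linux and Mac environments, but I could not find any for Windows, so I am summarizing it in this entry. The environment I checked was Windows 8.1 + Haskell Platform 2014.2.0.0.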

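Removal steps:

On Windows, Haskell Platform can be completely removed with the following steps.

Uninstall Haskell Platform

Open [Control Panel] - [Programs] - [Programs and Features]
Select "Haskell Platform 2014.2.0.0" and run [Uninstall]

Delete the package-related files created in the user area

Manually (with Explorer, the rm command, etc.) delete everything under the following two directories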

C:\Users\???\AppData\Roaming\ghc
C:\Users\???\AppData\Roaming\cabal

Manually delete all ".cabal-sandbox" directories
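If a GHC is still available (for example, before running the uninstaller), a short check like the following can show which of the per-user directories exist. This is only a sketch: it assumes the directory package that ships with GHC, it only reports and deletes nothing, and it does not search for ".cabal-sandbox" directories. The names appNames and report are made up for this example.

```haskell
import System.Directory (doesDirectoryExist, getAppUserDataDirectory)

-- Application names whose per-user data directories are removed in the steps above.
appNames :: [String]
appNames = ["ghc", "cabal"]

-- Print whether a directory is still present. Nothing is deleted.
report :: FilePath -> IO ()
report dir = do
  exists <- doesDirectoryExist dir
  putStrLn (dir ++ (if exists then ": still present" else ": not found"))

main :: IO ()
main = do
  -- On Windows this resolves under the user's AppData\Roaming directory.
  dirs <- mapM getAppUserDataDirectory appNames
  mapM_ report dirs
```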


Saturday, April 25, 2015

[haskell] Easy command-line argument parsing with the cmdargs package